FILE NO. A32000-A-072667.0172
PATENT 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Applicant : Yannick Batard et al. 

Serial No. : NOT YET ASSIGNED Examiner: 

Filed : HEREWITH Group Art Unit: 

For : RECODING OF DNA SEQUENCES 

PERMITTING EXPRESSION IN YEAST 
AND OBTAINED TRANSFORMED YEAST 

PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Please amend the above-identified application as follows: 

IN THE SPECIFICATION : 

Page 12, lines 15-16, delete "(sequence identifier No. 1)" and substitute
therefor --of SEQ ID NO: 1 (which encodes the amino acid sequence of SEQ ID NO:
15)--.

Page 14, line 11, after "No. 7" insert --(which encodes the amino acid
sequence of SEQ ID NO: 16)--.

Page 14, line 11, after "No. 8" insert --(which encodes the amino acid
sequence of SEQ ID NO: 17)--.

Page 14, line 11, after "No. 9" insert --(which encodes the amino acid
sequence of SEQ ID NO: 18)--.

Page 18, line 2, after "No. 10" insert --, which encodes the amino acid
sequence of SEQ ID NO: 19--.

Page 18, line 14, after "No. 14" insert --, which encodes the amino acid
sequence of SEQ ID NO: 20--.

Please delete pages 20-42 and renumber Pages 43-48 as pages 20-25. 

After page 48, please insert the attached substitute sequence listing. 

IN THE CLAIMS : 

Claim 5, lines 1-2, delete "one of Claims 1 to 4" and substitute therefor
--claim 1--.

Claim 7, lines 1-2, delete "one of claims 1 to 7" and substitute therefor
--claim 1--.

Claim 11, lines 1-2, delete "one of claims 9 or 10" and substitute therefor
--claim 1--.

Claim 12, lines 1-2, delete "one of claims 1 to 11" and substitute therefor
--claim 1--.

Claim 13, lines 1-2, delete "one of claims 1 to 12" and substitute therefor
--claim 1--.

Claim 15, lines 1-2, delete "one of claims 1 to 14" and substitute therefor
--claim 1--.

Claim 18, lines 1-2, delete "one of claims 1 to 17" and substitute therefor
--claim 1--.

Claim 22, line 2, delete "one of claims 1 to 21" and substitute therefor
--claim 1--.

Claim 27, line 5, delete "according to claim 23".

Claim 27, line 6, delete "one of claims 1 to 21" and substitute therefor
--claim 1--.

Claim 28, line 6, delete "according to claim 23".

Claim 28, lines 7-8, delete "one of claims 1 to 21" and substitute therefor
REMARKS

The foregoing amendments are necessary to conform the specification to 
the Sequence Listing and to remove multiple dependencies. No new matter has been 
introduced by the foregoing amendments. 

Respectfully submitted, 

Louis S. Sorell
Patent Office Reg. No. 32,439 

Attorney for Applicants 
(212) 408-2620 



Janet M. MacLeod 

Patent Office Reg. No. 35,263 

Attorney for Applicants 
(212) 408-2597 
The recoding of DNA sequences to enable them to be
expressed in yeasts, and the transformed yeasts
obtained

The present invention relates to the recoding 
of DNA sequences which encode proteins which contain 
regions having a high content of codons which are 
poorly translated by yeasts, in particular which encode 
proteins of plant origin, such as the P450 cytochromes 
of plant origin, and to their expression in yeasts. 

It is known that certain sequences encoding 
proteins of interest, in particular proteins of plant 
origin, are not readily translated in yeasts. This 
applies, in particular, to proteins which possess 
regions having a high content of codons which are 
poorly suited to yeasts, in particular leucine codons, 
such as some P450 cytochromes of plant origin. Some
systems which have been developed for improving the
expression of P450 cytochromes of animal or plant
origin in yeasts, such as those described by Pompon et
al. (Methods Enzymol., 272, 1996, 51-64; WO 97/10344),
have turned out to be unsuitable for large numbers of
P450 cytochromes which encompass regions having a high
content of codons which are poorly suited to yeasts.

The P450 cytochromes constitute a superfamily
of membrane enzymes of the monooxygenase type which are
able to oxidize a large family of generally hydrophobic
substrates. The reactions are most frequently
characterized by the oxidation of C-H or C=C bonds, and
of heteroatoms, and, more rarely, by the reduction of
nitro groups or by dehalogenation. More specifically,
these enzymes are involved in the metabolism of
xenobiotic substances and drugs and in the biosynthesis
of secondary metabolites in plants, some of which have
organoleptic or pharmacodynamic properties.

As a consequence, the P450 cytochromes are
used, in particular, in:

the in vitro diagnosis of the formation of
toxic or mutagenic metabolites (molecules of natural
origin, pollutants, drugs, pesticides, etc.), making it
possible, in particular, to develop novel active
molecules (pharmaceuticals, agrochemicals),

the identification and destruction of
molecules which are toxic for, or pollute, the
environment,

the enzymic synthesis of novel molecules.

The search for heterologous expression of
P450 cytochromes by host cells, more specifically
yeasts, is therefore important for obtaining controlled
production of these enzymes in large quantity, either for
isolating them and using them in the above-listed
processes, or for using the transformed cells directly
for the said processes without previously isolating the
enzyme.

The present invention provides a solution to
the abovementioned problem, enabling proteins which
contain regions having a high content of codons which
are poorly suited to yeasts, in particular P450
cytochromes of plant origin, to be expressed in yeasts.

The present invention therefore relates to a
DNA sequence, in particular a cDNA sequence, which
encodes a protein of interest which contains regions
having a high content of codons which are poorly suited
to yeasts, characterized in that a sufficient number of
codons which are poorly suited to yeasts is replaced
with corresponding codons which are well-suited to
yeasts in the said regions having a high content of
codons which are poorly suited to yeasts.

Within the meaning of the present invention,
"codons which are poorly suited to yeasts" are
understood as being codons whose frequency of use by
yeasts is less than or equal to approximately 13 per
1000, preferably less than or equal to approximately 12
per 1000, more preferably less than or equal to
approximately 10 per 1000. The frequency at which
codons are used by yeasts, more specifically by
S. cerevisiae, is described, in particular, in "Codon
usage data base from Yasukazu Nakamura"
(http://www.dna.affrc.go.jp/~nakamura/codon.html). This
applies, in particular, to codons CTC, CTG and CTT,
which encode leucine, to codons CGG, CGC, CGA, CGT and
AGG, which encode arginine, to codons GCG and GCC,
which encode alanine, to codons GGG, GGC and GGA, which
encode glycine, and to codons CCG and CCC, which encode
proline. The codons which are poorly suited to yeasts
in accordance with the invention are, more
specifically, codons CTC and CTG, which encode leucine,
codons CGG, CGC, CGA, CGT and AGG, which encode arginine,
codons GCG and GCC, which encode alanine, codons GGG and
GGC, which encode glycine, and codons CCG and CCC, which
encode proline.

Within the meaning of the present invention, 
"corresponding codons which are well-suited to yeasts" 
are understood as being the codons which correspond to 
the codons which are poorly suited to yeasts and which 
encode the same amino acids, and whose frequency of use 
by yeasts is greater than 15 per 1000, preferably 
greater than or equal to 18 per 1000, more preferably 
greater than or equal to 20 per 1000. This applies, in 
particular, to codons TTG and TTA, preferably TTG, 
which encode leucine, to codon AGA, which encodes 
arginine, to codons GCT and GCA, preferably GCT, which 
encode alanine, to codon GGT, which encodes glycine, 
and to codon CCA, which encodes proline. 

Within the meaning of the present invention,
"region having a high content of codons which are
poorly suited to yeasts" is understood as being any
region of the DNA sequence which contains at least 2
poorly suited codons among 10 consecutive codons, with
it being possible for the two codons to be adjacent or
separated by up to 8 other codons. According to one
preferred embodiment of the invention, the regions
having a high content of poorly suited codons contain
2, 3, 4, 5 or 6 poorly suited codons per 10 consecutive
codons, or contain at least 2 or 3 adjacent poorly
suited codons.
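
Purely as an illustration, the window-based definition above can be sketched in Python. The codon set is the narrower list given earlier in the description; the function names `split_codons` and `poor_regions`, the trimming of each region to its first and last poorly suited codon, and the merging of overlapping windows are assumptions made for this sketch and are not taken from the description.

```python
# Narrower list of poorly suited codons given in the description.
POOR_CODONS = frozenset({
    "CTC", "CTG",                       # leucine
    "CGG", "CGC", "CGA", "CGT", "AGG",  # arginine
    "GCG", "GCC",                       # alanine
    "GGG", "GGC",                       # glycine
    "CCG", "CCC",                       # proline
})


def split_codons(sequence):
    """Split a coding sequence (spaces allowed) into complete codons."""
    seq = "".join(sequence.split()).upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def poor_regions(codons, window=10, min_poor=2):
    """Return merged (start, end) codon ranges, end exclusive.

    A window of `window` consecutive codons qualifies when it holds at
    least `min_poor` poorly suited codons; each hit is trimmed to its
    first and last poor codon (an assumption of this sketch).
    """
    flags = [c in POOR_CODONS for c in codons]
    regions = []
    for start in range(len(codons)):
        end = min(start + window, len(codons))
        hits = [i for i in range(start, end) if flags[i]]
        if len(hits) < min_poor:
            continue
        lo, hi = hits[0], hits[-1] + 1
        if regions and lo <= regions[-1][1]:
            # Overlapping or adjacent hit: extend the previous region.
            regions[-1] = (regions[-1][0], max(regions[-1][1], hi))
        else:
            regions.append((lo, hi))
    return regions


if __name__ == "__main__":
    five_prime = "ATG GAC GTC CTC CTC CTG GAG AAG GCC"
    print(poor_regions(split_codons(five_prime)))
```

Applied to the first nine codons of the native CYP73A17 coding sequence, the sketch reports a single region running from the first CTC codon to the GCC codon.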

Within the meaning of the present invention,
"sufficient number of codons" is understood as being
the number of codons which it is necessary and
sufficient to replace in order to observe a substantial
improvement in the expression of the sequence in yeasts.
Advantageously, at least 50% of the codons which are
poorly suited to yeasts in the high-content region
under consideration are replaced with well-suited
codons. Preferably, at least 75% of the poorly suited
codons of the said region are replaced, with 100% of
the poorly suited codons more preferably being
replaced.

Within the meaning of the present invention,
"substantial improvement" is understood as being either
a detectable expression when no expression of the
reference sequence is observed, or an increase in
expression as compared with the level at which the
reference sequence is expressed.

Within the meaning of the present invention,
"reference sequence" designates any sequence which
encodes a protein of interest and which is modified in
accordance with the invention in order to promote its
expression in yeasts.
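
The replacement thresholds above (at least 50%, 75% or 100% of the poorly suited codons of a region) can be illustrated with a short Python sketch. The mapping uses the preferred well-suited codons named in the description (TTG, AGA, GCT, GGT, CCA); the function name `recode_region` and the choice to replace codons starting from the 5' end of the region are assumptions of this sketch, not requirements stated in the description.

```python
import math

# Poorly suited codon -> preferred well-suited codon for the same amino acid.
PREFERRED = {
    "CTC": "TTG", "CTG": "TTG",                                            # leucine
    "CGG": "AGA", "CGC": "AGA", "CGA": "AGA", "CGT": "AGA", "AGG": "AGA",  # arginine
    "GCG": "GCT", "GCC": "GCT",                                            # alanine
    "GGG": "GGT", "GGC": "GGT",                                            # glycine
    "CCG": "CCA", "CCC": "CCA",                                            # proline
}


def recode_region(codons, fraction=1.0):
    """Replace at least `fraction` of the poorly suited codons in a region.

    Codons are replaced in 5'-to-3' order (an assumption of this sketch);
    the encoded amino acids are unchanged.
    """
    poor_positions = [i for i, c in enumerate(codons) if c in PREFERRED]
    n_replace = math.ceil(fraction * len(poor_positions))
    recoded = list(codons)
    for i in poor_positions[:n_replace]:
        recoded[i] = PREFERRED[codons[i]]
    return recoded


if __name__ == "__main__":
    region = "CTC CTC CTG GAG AAG GCC".split()
    print(" ".join(recode_region(region, fraction=0.5)))
    print(" ".join(recode_region(region, fraction=1.0)))
```

Rounding the number of replacements up with `math.ceil` keeps the result at or above the requested fraction, which matches the "at least" wording of the thresholds.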

The present invention is particularly well 
suited to DNA sequences, in particular cDNA sequences, 



which encode proteins of interest which contain regions 
having a high content of leucine and in which a 
sufficient number of CTC codons encoding leucine in the 
said region having a high content of leucine is 
replaced with TTG and/or TTA codons, or in which a 
sufficient number of CTC and CTG codons encoding 
leucine in the said region having a high content of 
leucine is replaced with TTG and/or TTA codons, 
preferably with a TTG codon. 

Within the meaning of the present invention, 
"region having a high content of leucine" is understood 
as being a region which contains at least 2 leucines 
among 10 consecutive amino acids in the protein of 
interest, with it being possible for the two leucines 
to be adjacent or separated by up to 8 other amino 
acids. According to one preferred embodiment of the 
invention, the regions having a high content of leucine 
contain 2, 3, 4, 5 or 6 leucines per 10 consecutive 
amino acids, or contain at least 2 or 3 adjacent 
leucines.

According to a preferred embodiment of the 
invention, at least 50% of the CTC or CTC and CTG 
codons of the region having a high content of leucine 
are replaced with TTG or TTA codons, with at least 75% 
of the CTC or CTC and CTG codons of the said region 
preferably being replaced, and 100% of the CTC or CTC 
and CTG codons more preferably being replaced. 

Advantageously, the present invention is
particularly suitable for DNA sequences whose general
content of poorly suited codons is at least 20%, more
preferably at least 30%, as compared with the total
number of codons in the reference sequence.

Advantageously, when the reference sequence
contains at least one 5' region having a high content
of poorly suited codons, the recoding of this 5' region
alone makes it possible to obtain a substantial
improvement in the expression of the protein of
interest in yeasts. The length of the 5' region to be
recoded in accordance with the invention will vary
depending on the length of the region having a high
content of poorly suited codons. This length will
advantageously be at least four codons, in particular
when this region contains at least two adjacent poor
codons, up to approximately 40 codons or more.
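
As a rough illustration of the 20% and 30% thresholds, the general content of poorly suited codons can be computed as the share of such codons among all codons of a sequence. The sketch below uses the narrower codon list from the description; the function name `poor_codon_content` is hypothetical, and the short example sequence is only the first nine codons of the native CYP73A17 coding sequence, not the whole reference sequence.

```python
# Narrower list of poorly suited codons given in the description.
POOR_CODONS = frozenset({
    "CTC", "CTG", "CGG", "CGC", "CGA", "CGT", "AGG",
    "GCG", "GCC", "GGG", "GGC", "CCG", "CCC",
})


def poor_codon_content(sequence):
    """Return the fraction of codons in `sequence` that are poorly suited."""
    seq = "".join(sequence.split()).upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    if not codons:
        return 0.0
    return sum(c in POOR_CODONS for c in codons) / len(codons)


if __name__ == "__main__":
    content = poor_codon_content("ATG GAC GTC CTC CTC CTG GAG AAG GCC")
    print(f"{content:.1%} poorly suited codons; above 20%: {content >= 0.20}")
```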

However, it is not necessary, according to 
the invention, to recode all the reference sequence, 
but only the regions having a high content of poor 
codons, in particular the 5' region on its own, in 
order to obtain a substantial improvement in the 
expression of the protein of interest in yeasts. 

Advantageously, the DNA sequence encoding a
protein of interest is an isolated DNA sequence of
natural origin, in particular of plant origin. The
invention is particularly advantageous for sequences
which originate from monocotyledonous or dicotyledonous
plants, preferably monocotyledonous plants, in
particular of the Gramineae family, such as wheat,
barley, oats, rice, maize, sorghum, sugar cane, etc.

According to a preferred embodiment of the
invention, the DNA sequence encodes an enzyme, in
particular a cytochrome P450, which is preferably of
plant origin. These P450 cytochromes exhibit a high
content of poorly suited codons, in particular encoding
leucine, in their N-terminal region; it is in the
5'-terminal coding region that the poorly suited codons
are replaced.

The present invention also relates to a 
chimeric gene which comprises a DNA sequence which has 
been modified as above and heterologous 5' and 3' 
regulatory elements which are able to function in a 
yeast, that is to say which are able to control the 
expression of the protein of interest in the yeast. 
Such regulatory elements are well known to the skilled
person and are described, in particular, by Rozman et
al. (Genomics, 38, 1996, 371-381) and by Nacken et al.
(Gene, 175, 1996, 253-260, Probing the limits of
expression levels by varying promoter strength and
plasmid copy number in Saccharomyces cerevisiae).

The present invention also relates to a 
vector for transforming yeasts which contains at least 
one chimeric gene as described above. It also relates 
to a process for transforming yeasts with the said 
vector and to the transformed yeasts which are 
obtained. It finally relates to a process for producing
a heterologous protein of interest in a transformed
yeast, with the sequence which encodes the said protein
of interest being such as defined above.

The process for producing a heterologous
protein of interest in a transformed yeast comprises
the steps of:

a) transforming a yeast with a vector which 
is able to replicate in yeasts and which contains a 
modified DNA sequence as defined above and heterologous 
5' and 3' regulatory elements which are able to 
function in a yeast, 

b) culturing the transformed yeast, and 

c) extracting the protein of interest from 
the yeast culture. 

When the protein of interest is an enzyme 
which is suitable for transforming a substrate, such as 
a cytochrome P450, the enzyme which has been extracted 
from the yeast culture is then used for catalysing the 
transformation of the said substrate. 

However, the catalysis can also be carried out
without extracting the enzyme from the yeast, by
culturing the transformed yeast in the presence of the
said substrate.

The present invention also relates,
therefore, to a process for transforming a substrate by
enzymic catalysis using an enzyme which is expressed in
a yeast, which process comprises the steps of:

a) culturing the yeast which has been
transformed in accordance with the invention in the
presence of the substrate to be transformed, then

b) recovering the transformed substrate from
the yeast culture.

When the yeast has been transformed for
expressing a cytochrome P450, the reaction which is
catalysed by the enzyme is an oxidation reaction, more
specifically a reaction in which C-H or C=C bonds are
oxidized.

The techniques for transforming and culturing
yeasts are known to the skilled person, and are
described, for example, in Methods in Enzymology (Vol.
194, 1991).

Yeasts which are of use in accordance with
the invention are selected, in particular, from the
genera Saccharomyces, Kluyveromyces, Hansenula, Pichia
and Yarrowia. Advantageously, the yeast belongs to the
Saccharomyces genus, and is in particular S. cerevisiae.

Other characteristics of the invention will
become apparent in the light of the examples which
follow.

Example 1: Production of a wheat cDNA gene library, and
identification of the CYP73A17 sequence

The wheat cytochrome P450 CYP73A17 sequence
was obtained by screening a young wheat plantlet
(shoots and roots without the caryopses) cDNA library
which was constructed in the vector λ-ZapII
(Stratagene) in accordance with the supplier's
instructions.

1. Production of the cDNA library

Triticum aestivum (L. cv. Darius) seeds which
had been coated with cloquintocet-mexyl (0.1% per dry
weight of seed) are cultured in plastic boxes on two
layers of damp gauze until shoots having a size of 3 to
5 mm are obtained. The water in the boxes is then
replaced with a solution of 4 mM sodium phenobarbital
and the wheat is cultured until the shoots are
approximately 1 cm in size.

The cDNA library is constructed in the
λ-ZapII (Stratagene) vector, in accordance with the
supplier's protocol and instructions, using 5 µg of
poly(A)+ RNA (Lesot, A., Benveniste, I., Hasenfratz,
M.P., Durst, F. (1990) Induction of NADPH cytochrome
P450 (c) reductase in wounded tissues from Helianthus
tuberosus tubers. Plant Cell Physiol., 31, 1177-1182)
which were isolated from the treated roots and shoots.

2. Screening the cDNA library

5x10 s lysis plaques from the previously
obtained λ-ZapII library are screened using a probe
which corresponds to the complete coding sequence of
Helianthus tuberosus CYP73A1, and which has been
labelled by random priming with [α-32P]dCTP. The filters
are prehybridized and hybridized at low stringency at
55°C in accordance with the standard protocols. The
membranes are washed twice for 10 minutes with 2 x SSC,
0.1% SDS, and once for 10 minutes with 0.2 x SSC, 0.1%
SDS at ambient temperature, then twice for 30 minutes
with 0.2 x SSC, 0.1% SDS at 45°C. The inserts of the
positive lysis plaques are analysed by PCR
(polymerase chain reaction) and hybridization in
order to determine their size. The clones containing
inserts which hybridize with CYP73A1 under the
above-described conditions and which are greater than 1.5 kbp
in size are rescreened before excision of the
pBluescript plasmid in accordance with the supplier's
(Stratagene) protocol and sequencing using the Ready
Reaction Dye Deoxy Terminator Cycle prism technique
developed by Applied Biosystems Inc. A full length
clone is then identified by alignment with CYP73A1.

The wheat cytochrome P450 CYP73A17 which is
encoded by the isolated sequence (sequence identifier
No. 1) exhibits 76.2% identity with the Helianthus
tuberosus CYP73A1.

Example 2: Alterations to the sequence encoding the
wheat cytochrome P450 CYP73A17

Contrary to the situation with regard to
Helianthus tuberosus CYP73A1, which can be expressed in
yeasts (Urban et al., 1994), repeated attempts to
express wheat CYP73A17 in yeasts using the same
customary techniques proved to be fruitless when the
nucleotide sequence was not altered at the time it was
inserted into the expression vector (verification by
sequencing). No protein is detected by
spectrophotometry or by immunoblotting, just as no
enzymic activity is detectable in the microsomes of
transformed and induced yeast.

1. Alteration of the coding sequence

The sequence encoding wheat CYP73A17 (SEQ ID
No. 1) was therefore altered, in three different ways,
by PCR-induced mutagenesis, as follows:

The BamHI and EcoRI restriction sites were
respectively introduced by PCR just upstream of the ATG
codon and just downstream of the stop codon of the
CYP73A17 coding sequence (source, origin) using the
sense and reverse primers described below, with the
restriction sites being BamHI in the case of the sense
primers Rec1 (SEQ ID No. 3), Rec2 (SEQ ID No. 4) and
Rec3 (SEQ ID No. 5), and EcoRI in the case of the
reverse primer (SEQ ID No. 6).

A primer, represented by SEQ ID No. 2, was
also employed for enabling yeasts to be transformed
with the unmodified (native) sequence encoding wheat
CYP73A17.

The five primers described above were
obtained from Eurogentec, and were synthesized and
purified in accordance with customary methods.

For each alteration using the four different
sense primers, the mode of operation is as follows:

The reaction mixture (20 mM Tris-HCl, pH
8.75, 10 mM KCl, 10 mM (NH4)2SO4, 2 mM MgSO4, 0.1% Triton
X-100, 0.1 mg/ml BSA, 5% (v/v) DMSO, 300 dNTP,
20 pmoles of each primer, 150 ng of template, total
volume 50 µl) is preheated at 94°C for 2 minutes before
adding 5 units of Pfu DNA polymerase (Stratagene).
After 2 minutes at 94°C, 30 amplification cycles are
carried out as follows: 1 minute of denaturation at
94°C, 2 minutes of hybridization at 55°C, 2 minutes of
extension at 72°C. The reaction is completed by
10 minutes of extension at 72°C.
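
The thermal programme described above can be written down as a small data structure, which makes it easy to check the total incubation time. This is only a sketch: the step labels and the names `PROGRAM` and `total_minutes` are hypothetical, and ramp times between temperatures are ignored.

```python
# (label, temperature in deg C, minutes, number of repeats)
PROGRAM = [
    ("preheat before adding Pfu polymerase", 94, 2, 1),
    ("initial denaturation", 94, 2, 1),
    ("cycle: denaturation", 94, 1, 30),
    ("cycle: hybridization", 55, 2, 30),
    ("cycle: extension", 72, 2, 30),
    ("final extension", 72, 10, 1),
]


def total_minutes(program):
    """Sum the hold times of all steps, ignoring ramp times."""
    return sum(minutes * repeats for _, _, minutes, repeats in program)


if __name__ == "__main__":
    for label, temp, minutes, repeats in PROGRAM:
        print(f"{label:40s} {temp:3d} C  {minutes:2d} min x {repeats}")
    print("total hold time (min):", total_minutes(PROGRAM))
```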

For each primer, a sequence is obtained which
is derived from sequence ID No. 1, and which is
represented, in the case of the altered coding
sequences, by the sequences ID No. 7, No. 8 and No. 9.
The 5' ends of the sequences obtained using the four
abovementioned sense primers are depicted below, with
the BamHI restriction site (GGATCC) immediately
preceding the ATG codon:
native : ATATATGGATCC ATG GAC GTC CTC CTC CTG GAG AAG GCC
Rec 1  : ATATATGGATCC ATG GAT GTT TTG TTG TTG GAG AAG GCC
Rec 2  : ATATATGGATCC ATG GAT GTT TTG TTG TTG GAA AAA GCT
Rec 3  : ATATATGGATCC ATG GAT GTT TTG TTG TTG GAA AAA GCT
Protein:              met asp val leu leu leu glu lys ala

native : CTC CTG GGC CTC TTC GCC GCG GCG GTG CTG GCC ATC GCC GTC GCC
Rec 1  : CTC CTG GGC CTC TTC GCC GCG GCG GTG CTG GCC ATC GCC GTC GCC
Rec 2  : TTG TTG GGT TTG TTC GCC GCG GCG GTG CTG GCC ATC GCC GTC GCC
Rec 3  : TTG TTG GGT TTG TTT GCT GCT GCT GTT TTG GCT ATT GCT GTT GCT
Protein: leu leu gly leu phe ala ala ala val leu ala ile ala val ala

native : AAG CTC ACC GGC AAG CGC TTC
Rec 1  : AAG CTC ACC GGC AAG CGC TTC
Rec 2  : AAG CTC ACC GGC AAG CGC TTC
Rec 3  : AAA TTG ACT GGT AAA AGA TTT
Protein: lys leu thr gly lys arg phe

native : CGC CTC CCC CCT GGC CCC TCC GGC
Rec 1  : CGC CTC CCC CCT GGC CCC TCC GGC
Rec 2  : CGC CTC CCC CCT GGC CCC TCC GGC
Rec 3  : AGA TTG CCA CCA GGT CCA TCC GGC
Protein: arg leu pro pro gly pro ser gly

native : GCC CCC ATC GTC
Rec 1  : GCC CCC ATC GTC
Rec 2  : GCC CCC ATC GTC
Rec 3  : GCC CCC ATC GTC
Protein: ala pro ile val
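
A recoding of this kind must leave the encoded protein unchanged. The following Python sketch checks this for the native and Rec 3 5' ends listed above, using the standard genetic code. The helper names `translate` and `changed_codons` are hypothetical, and the two sequences are transcribed from the alignment without the ATATATGGATCC leader.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

NATIVE_5P = (
    "ATG GAC GTC CTC CTC CTG GAG AAG GCC "
    "CTC CTG GGC CTC TTC GCC GCG GCG GTG CTG GCC ATC GCC GTC GCC "
    "AAG CTC ACC GGC AAG CGC TTC "
    "CGC CTC CCC CCT GGC CCC TCC GGC "
    "GCC CCC ATC GTC"
)
REC3_5P = (
    "ATG GAT GTT TTG TTG TTG GAA AAA GCT "
    "TTG TTG GGT TTG TTT GCT GCT GCT GTT TTG GCT ATT GCT GTT GCT "
    "AAA TTG ACT GGT AAA AGA TTT "
    "AGA TTG CCA CCA GGT CCA TCC GGC "
    "GCC CCC ATC GTC"
)


def translate(sequence):
    """Translate space-separated codons into a one-letter protein string."""
    return "".join(CODON_TABLE[codon] for codon in sequence.split())


def changed_codons(reference, recoded):
    """Count codon positions at which the two sequences differ."""
    return sum(a != b for a, b in zip(reference.split(), recoded.split()))


if __name__ == "__main__":
    print("same protein:", translate(NATIVE_5P) == translate(REC3_5P))
    print("codons changed in Rec 3:", changed_codons(NATIVE_5P, REC3_5P))
```

The same check can be repeated for Rec 1 and Rec 2 by substituting the corresponding rows of the alignment.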



2. Transforming the yeasts

After having been digested with the
restriction enzymes BamHI and EcoRI, the four
above-described altered coding sequences are integrated into
the vector pYeDP60, which is described by Pompon et al.
(Methods Enzymol., 272, 1996, 51-64; WO 97/10344), the
content of which is hereby incorporated by reference
with regard to the plasmid, the method of insertion
into the plasmid, and the method of transforming and
growing the yeasts, in particular using the
Saccharomyces cerevisiae yeast strains W(R), WAT21 and
WAT11. The method for transforming and growing yeasts
is also described by Pompon et al. and by Urban et al.
(Eur. J. Biochem., 222, 1994, page 844, 2nd column,
"Yeast transformation and cell culture").

Four transformed yeast strains, designated
W73A17(native), W73A17(Rec1), W73A17(Rec2) and
W73A17(Rec3), are obtained.

Example 3: Expression of CYP73A17 in the altered yeasts

The previously obtained transformed yeasts
are cultured, in accordance with the method described
by Urban et al. (Eur. J. Biochem., 222, 1994, page 844,
2nd column, "Yeast transformation and cell culture"),
in 50 ml of SGI medium at 30°C for 72 h. The cells are
recovered by centrifuging at 8000 g for 10 minutes,
washed with 25 ml of YPI medium, recentrifuged, and
then resuspended in 250 ml of YPI medium. The cells are
induced with galactose for 14-16 h, while being shaken
at 160 rpm, until the cell density reaches 10 s cells per
ml. The microsomes are then prepared using the method
described by Pierrel et al. (Eur. J. Biochem., 224,
1994, 835-844).

The expression of CYP73A17 achieved in the
case of the four strains is quantified by differential
spectrophotometry using the method described by Omura
and Sato (J. Biol. Chem., 177, 678-693). It is
proportional to the number of poorly suited codons
which have been altered.

The microsomal enzymic activity is measured
using the method described by Durst F., Benveniste I.,
Schalk M. and Werck-Reichhart D. (1996) Cinnamic acid
hydroxylase activity in plant microsomes. Methods
Enzymol. 272, 259-268. The results obtained after
transforming WAT21 are recorded in the Table below. The
activity is expressed as cinnamate 4-hydroxylase
activity. The percentage additional activity (rounded
values) illustrates the extent of the leap in activity
which is observed after the poorly suited codons have
been altered.



Strain           Activity (pmol/min/µg of protein)   % additional activity

W73A17 native    0.64
W73A17 Rec1      2.84                                + 340
W73A17 Rec2      4.92                                + 670
W73A17 Rec3      8.90                                + 1300



These results relating to the increase in
enzymic activity confirm those relating to the increase
in the expression of the protein in the yeasts. They
demonstrate that alteration of the 5' end alone, even
when limited (Rec1), is sufficient to obtain a very
substantial improvement in the production of the enzyme
by the yeast and in its enzymic activity.

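
The percentage additional activity can be recomputed from the activities in the Table as (activity - native activity) / native activity x 100. The short Python sketch below does this; the variable and function names are hypothetical, and the Table reports these values in rounded form.

```python
# Cinnamate 4-hydroxylase activities from the Table (pmol/min/ug of protein).
ACTIVITY = {
    "W73A17 native": 0.64,
    "W73A17 Rec1": 2.84,
    "W73A17 Rec2": 4.92,
    "W73A17 Rec3": 8.90,
}


def additional_activity(activity, reference):
    """Percentage increase of `activity` over the `reference` activity."""
    return (activity - reference) / reference * 100.0


if __name__ == "__main__":
    native = ACTIVITY["W73A17 native"]
    for strain, value in ACTIVITY.items():
        if strain != "W73A17 native":
            print(f"{strain}: +{additional_activity(value, native):.1f}%")
```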
Example 4: Expression of wheat CYP86A5 in the altered 
yeasts 

The sequence encoding wheat cytochrome P450
CYP86A5, which is depicted by sequence identifier No.
10 (SEQ ID No. 10), was isolated from the wheat cDNA
library described in Example 1 using the same method of
operation as described for the CYP73A17 sequence and
employing the complete coding sequence of Arabidopsis
thaliana CYP86A1 as the probe. This wheat CYP86A5
sequence was altered, in accordance with the mode of
operation of Example 2, using the two oligonucleotides
depicted by the sequences ID No. 12 and 13 (SEQ ID
No. 12 and SEQ ID No. 13) as sense and reverse primers,
respectively, in order to obtain the coding sequence
which is altered in accordance with the invention and
which is depicted by sequence identifier No. 14 (SEQ ID
No. 14).

A primer depicted by SEQ ID No. 11 was also
used to enable yeasts to be transformed with the
sequence encoding unmodified (native) wheat CYP86A5.

The yeasts are transformed with this new
coding sequence and the expression is quantified by
differential spectrophotometry in accordance with the
mode of operation described in Example 2. While the
natural sequence of wheat CYP86A5 is not expressed in a
detectable manner, there is substantial expression in
the transformed yeasts of the sequence which has been
modified in accordance with the invention.

The above-described examples demonstrate
unambiguously that the expression in yeasts of DNA
sequences which possess a 5' region having a high
content of codons which are poorly suited to yeasts is
substantially improved when this region alone is simply
recoded in accordance with the invention, even
partially, with corresponding codons which are
well-suited to yeasts.
SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(iii) NUMBER OF SEQUENCES: 14 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2261 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 49..1551

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CTC CTC CTG GAG AAG GCC CTC CTG GGC CTC TTC GCC GCG GCG GTG CTG
Leu Leu Leu Glu Lys Ala Leu Leu Gly Leu Phe Ala Ala Ala Val Leu
5 10 15

GCC ATC GCC GTC GCC AAG CTC ACC GGC AAG CGC TTC CGC CTC CCC CCT
Ala Ile Ala Val Ala Lys Leu Thr Gly Lys Arg Phe Arg Leu Pro Pro
20 25 30 35

GGC CCC TCC GGC GCC CCC ATC GTC GGC AAC TGG CTG CAG GTC GGC GAC
Gly Pro Ser Gly Ala Pro Ile Val Gly Asn Trp Leu Gln Val Gly Asp
40 45 50

GAC CTC AAC CAC CGC AAC CTG ATG GGC CTG GCC AAG CGG TTC GGC GAG
Asp Leu Asn His Arg Asn Leu Met Gly Leu Ala Lys Arg Phe Gly Glu
55 60 65

GTG TTC CTC CTC CGC ATG GGC GTC CGC AAC CTG GTG GTC GTC TCC AGC
Val Phe Leu Leu Arg Met Gly Val Arg Asn Leu Val Val Val Ser Ser
70 75 80

CCC GAG CTC GCC AAG GAG GTC CTC CAC ACC CAG GGC GTC GAG TTC GGC
Pro Glu Leu Ala Lys Glu Val Leu His Thr Gln Gly Val Glu Phe Gly
85 90 95

TCC CGC ACC CGC AAC GTC GTC TTC GAC ATC TTC ACC GGC AAG GGA CAG
Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly Lys Gly Gln
100 105 110 115

GAC ATG GTG TTC ACG GTG TAC GGC GAC CAC TGG CGC AAG ATG CGG CGG
Asp Met Val Phe Thr Val Tyr Gly Asp His Trp Arg Lys Met Arg Arg
120 125 130

ATC ATG ACG GTG CCC TTC TTC ACC AAC AAG GTG GTG GCG CAG AAC CGC
Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Ala Gln Asn Arg
135 140 145

GTG GGG TGG GAG GAG GAG GCC CGG CTG GTG GTG GAG GAC CTC AAG GCC
Val Gly Trp Glu Glu Glu Ala Arg Leu Val Val Glu Asp Leu Lys Ala
150 155 160
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GAC CCG GCG GCG GCG ACG GCG GGC GTG GTG GTC CGC CGC AGG CTG CAG 
Asp Pro Ala Ala Ala Thr Ala Gly Val Val Val Arg Arg Arg Leu Gin 
165 170 175 

CTC ATG ATG TAC AAC GAC ATG TTC CGC ATC ATG TTC GAC CGC CGG TTC 
Leu Met Met Tyr Asn Asp Met Phe Arg lie Met Phe Asp Arg Arg Phe 
180 185 190 195 

GAC AGC GTG GCC GAC CCG CTC TTC AAC CAG CTC.AAG GCG CTC AAC GCC 
Glu Ser Val Ala Asp Pro Leu Phe Asn Gin Leu Lys Ala Leu Asn Ala 
200 205 210 

GAG CGC AGC ATC CTC TCC CAG AGC TTC GAC TAC AAC TAC GGC GAC TTC 
Glu Arg Ser He Leu Ser Gin Ser Phe Asp Tyr Asn Tyr Gly Asp Phe 
215 220 225 

ATC CCC GTC CTC CGC CCC TTC CTC CGC CGC TAC CTC AAC CGC TGC ACC 
He Pro Val Leu Arg Pro Phe Leu Arg Arg Tyr Leu Asn Arg Cys Thr 
230 235 240 

AAC CTC AAG ACC AAG CGG ATG AAG GTG TTC GAG GAC CAC TTC GTC CAG 
Asn Leu Lys Thr Lys Arg Met Lys Val Phe Glu Asp His Phe Val Gin 
245 250 255 

CAG CGC AAG GAG GCG TTG GAG AAG ACG GGT GAG ATC AGG TGC GCC ATG 
Gin Arg Lys Glu Ala Leu Glu Lys Thr Gly Glu He Arg Cy« Ala Met 

260 265 270 27S 

GAC CAC ATC CTG GAA GCC GAA AGG AAG GGC GAG ATC AAC CAC GAC AAC 
Asp His He Leu Glu Ala Glu Arg Lys Gly Glu He Asn His Asp Asn 
280 285 290 

GTC CTC TAC ATC GTC GAG AAC ATC AAC GTC GCA GCC ATC GAG ACG ACG 
Val Leu Tyr He Val Glu Asn He Asn Val Ala Ala lie Glu Thr Thr 
295 300 Joa 

CTG TGG TCG ATC GAG TGG GGC CTC GCG GAG CTG GTG AAC CAC CCG GAG 
Leu Trp Ser He Glu Trp Gly Leu Ala Glu Leu Val Asn His Pro Glu 

310 315 320 

ATC CAG CAG AAG CTG CGC GAG GAG ATC GTC GCC GTT CTG GGC GCC GGC 
Ile Gln Gln Lys Leu Arg Glu Glu Ile Val Ala Val Leu Gly Ala Gly
325 330 335 

GTG GCG GTG ACG GAG CCG GAC CTG GAG CGC CTC CCC TAC CTG CAG TCC 
Val Ala Val Thr Glu Pro Asp Leu Glu Arg Leu Pro Tyr Leu Gln Ser
340 345 350 355 

GTG GTG AAG GAG ACG CTC CGC CTC CGC ATG GCA ATC CCG CTC CTG GTG 
Val Val Lys Glu Thr Leu Arg Leu Arg Met Ala Ile Pro Leu Leu Val

360 365 370 

CCG CAC ATG AAC CTC AGC GAC GCC AAG CTC GCC GGC TAC GAC ATC CCC 
Pro His Met Asn Leu Ser Asp Ala Lys Leu Ala Gly Tyr Asp Ile Pro
375 380 385 

GCC GAG TCC AAG ATC CTC GTC AAC GCC TGG TTC CTC GCC AAC GAC CCC 
Ala Glu Ser Lys Ile Leu Val Asn Ala Trp Phe Leu Ala Asn Asp Pro
390 395 400 

AAG CGG TGG GTG CGC GCC GAT GAG TTC AGG CCG GAG AGG TTC CTC GAG 
Lys Arg Trp Val Arg Ala Asp Glu Phe Arg Pro Glu Arg Phe Leu Glu 

405 410 415

GAG GAG AAG GCC GTC GAG GCC CAC GGC AAC GAT TTC CGG CTC GTC CCC 
Glu Glu Lys Ala Val Glu Ala His Gly Asn Asp Phe Arg Phe Val Pro 
420 425 430 435 
TTC GGC GTC GGC CGC CGG AGC TGC CCC GGG ATC ATC CTC GCG CTG CCC 1401
Phe Gly Val Gly Arg Arg Ser Cys Pro Gly Ile Ile Leu Ala Leu Pro
440 445 450 

ATC ATC GGC ATC ACG CTC GGA CGC CTG GTG CAG AAC TTC CAG CTG CTG 1449
Ile Ile Gly Ile Thr Leu Gly Arg Leu Val Gln Asn Phe Gln Leu Leu
455 460 465 

CCG CCG CCG GGG CAG GAC AAG ATC GAC ACC ACC GAG AAG CCC GGG CAG 1497 
Pro Pro Pro Gly Gln Asp Lys Ile Asp Thr Thr Glu Lys Pro Gly Gln
470 475 480 

TTT ACC AAC CAG ATC CTC AAG CAC GCC ACC ATT GTC TGC AAG CCA CTC 1545 
Phe Thr Asn Gln Ile Leu Lys His Ala Thr Ile Val Cys Lys Pro Leu
485 490 495 

GAG GCT TAACTGAATT GAGGTTTCGG TCATGGGCGC CCGCTGACGC GGGGAGATGG 1601 

Glu Ala 

500 

ATCTATGCAT GTGACTGTGT ATTTTGCCTT CTTTCTTTTT GGTGTTGTTT TTTGCAGTAG 1661

TAAGTTTAAT TTTTCTTTGG TGTTGGCCTA TTTGTCTTCA TGTGAGGCGT CGTGTTGTAA 1721 

ATTTCCATAT AGTTGGCAAT GTGATGTAAA ACTTGGCTCC AAAAAAAAAA AAAAAAAACT 1781 

CGAGACTCTT CTCTCTCTCT CTCTCTCTCC AGCCTCGGGT CTCTGCTGGC AAGGGAACTT 1841 

GCATTACCCT GTGTACGACG GCGCCATGTT CGTCCCTGAA GCACCCTCCC TGCAGAGCTC 1901 

CCAGGAGAAC TTCGCTGCAT CTGCTGGTTT CAAGCGTCGA AGGAGAGAGT TTTGAATACC 1961 

CGAAAGAATA TAGCGTTGGA CATATCTGTC AAACAGGGGA TCTTGCTGTG GGTCTCTTGG 2021 

TGGGCCAAAT CGCATAGACA ATCATTCAAA TGGATGGGTT CTTCGCTGGT CGGTCAAAAA 2081 

GTATATGTTG TAATTGTACG CCTTTTTTGG GTCTTGTTGC CAAAGATCAT GGTTATTGAG 2141 

TTGTGAGCTC TGAGATAACA GGTTTGTGTA TAGTGAAATA AAGAGGAGCG TCGTCAACAC 2201 

CATGTACTAT ATAGGCTTTG AAATTCCATT AAGATGCATC AGAAATCAAT GTTGGATTTG 2261 
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
ATATATGGAT CCATGGACGT CCTCCTCCTG GAGAAGGC 38 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
ATATATGGAT CCATGGATGT TTTGTTGTTG GAGAAGGCCC TCCTGGGCCT CTTCGC 56

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 71 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ATATATGGAT CCATGGATGT TTTGTTGTTG GAAAAAGCTT TGTTGGGTTT GTTCGCCGCG 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATATATGGAT CCATGGATGT TTTGTTGTTG GAAAAAGCTT TGTTGGGTTT GTTTGCTGCT 60 
GCTGTTTTGG CTATTGCTGT TGCTAAATTG ACTGGTAAAA GATTTAGATT GCCACCAGGT 120 
CCATCCGGCG CCCCCATCGT CGG 143 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TATATAGAAT TCCAGTTAAG CCTCGAGTGG CTTGCAGAC 39

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1506 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1503

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
ATG GAT GTT TTG TTG TTG GAG AAG GCC CTC CTG GGC CTC TTC GCC GCG 48
Met Asp Val Leu Leu Leu Glu Lys Ala Leu Leu Gly Leu Phe Ala Ala
1 5 10 15

GCG GTG CTG GCC ATC GCC GTC GCC AAG CTC ACC GGC AAG CGC TTC CGC 96 
Ala Val Leu Ala Ile Ala Val Ala Lys Leu Thr Gly Lys Arg Phe Arg
20 25 30 

CTC CCC CCT GGC CCC TCC GGC GCC CCC ATC GTC GGC AAC TGG CTG CAG 144 
Leu Pro Pro Gly Pro Ser Gly Ala Pro Ile Val Gly Asn Trp Leu Gln
35 40 45 

GTC GGC GAC GAC CTC AAC CAC CGC AAC CTG ATG GGC CTG GCC AAG CGG 192 

Val Gly Asp Asp Leu Asn His Arg Asn Leu Met Gly Leu Ala Lys Arg 

50 55 60 

TTC GGC GAG GTG TTC CTC CTC CGC ATG GGC GTC CGC AAC CTG GTG GTC 240 
Phe Gly Glu Val Phe Leu Leu Arg Met Gly Val Arg Asn Leu Val Val 
65 70 75 80 

GTC TCC AGC CCC GAG CTC GCC AAG GAG GTC CTC CAC ACC CAG GGC GTC 288 
Val Ser Ser Pro Glu Leu Ala Lys Glu Val Leu His Thr Gln Gly Val
85 90 95 

GAG TTC GGC TCC CGC ACC CGC AAC GTC GTC TTC GAC ATC TTC ACC GGC 336
Glu Phe Gly Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly
100 105 110 

AAG GGA CAG GAC ATG GTG TTC ACG GTG TAC GGC GAC CAC TGG CGC AAG 384 
Lys Gly Gln Asp Met Val Phe Thr Val Tyr Gly Asp His Trp Arg Lys
115 120 125 

ATG CGG CGG ATC ATG ACG GTG CCC TTC TTC ACC AAC AAG GTG GTG GCG 432 
Met Arg Arg Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Ala
130 135 140 

CAG AAC CGC GTG GGG TGG GAG GAG GAG GCC CGG CTG GTG GTG GAG GAC 480 
Gln Asn Arg Val Gly Trp Glu Glu Glu Ala Arg Leu Val Val Glu Asp
145 150 155 160 

CTC AAG GCC GAC CCG GCG GCG GCG ACG GCG GGC GTG GTG GTC CGC CGC 528
Leu Lys Ala Asp Pro Ala Ala Ala Thr Ala Gly Val Val Val Arg Arg
165 170 175

AGG CTG CAG CTC ATG ATG TAC AAC GAC ATG TTC CGC ATC ATG TTC GAC 576 
Arg Leu Gln Leu Met Met Tyr Asn Asp Met Phe Arg Ile Met Phe Asp
180 185 190 

CGC CGG TTC GAG AGC GTG GCC GAC CCG CTC TTC AAC CAG CTC AAG GCG 624 
Arg Arg Phe Glu Ser Val Ala Asp Pro Leu Phe Asn Gln Leu Lys Ala
195 200 205 

CTC AAC GCC GAG CGC AGC ATC CTC TCC CAG AGC TTC GAC TAC AAC TAC 672 
Leu Asn Ala Glu Arg Ser Ile Leu Ser Gln Ser Phe Asp Tyr Asn Tyr
210 215 220 

GGC GAC TTC ATC CCC GTC CTC CGC CCC TTC CTC CGC CGC TAC CTC AAC 720 
Gly Asp Phe Ile Pro Val Leu Arg Pro Phe Leu Arg Arg Tyr Leu Asn
225 230 235 240 
CGC TGC ACC AAC CTC AAG ACC AAG CGG ATG AAG GTG TTC GAG GAC CAC
Arg Cys Thr Asn Leu Lys Thr Lys Arg Met Lys Val Phe Glu Asp His
245 250 255 

TTC GTC CAG CAG CGC AAG GAG GCG TTG GAG AAG ACG GGT GAG ATC AGG 
Phe Val Gln Gln Arg Lys Glu Ala Leu Glu Lys Thr Gly Glu Ile Arg
260 265 270 

TGC GCC ATG GAC CAC ATC CTG GAA GCC GAA AGG AAG GGC GAG ATC AAC 
Cys Ala Met Asp His Ile Leu Glu Ala Glu Arg Lys Gly Glu Ile Asn
275 280 285 

CAC GAC AAC GTC CTC TAC ATC GTC GAG AAC ATC AAC GTC GCA GCC ATC 
His Asp Asn Val Leu Tyr Ile Val Glu Asn Ile Asn Val Ala Ala Ile

290 295 300 

GAG ACG ACG CTG TGG TCG ATC GAG TGG GGC CTC GCG GAG CTG GTG AAC 
Glu Thr Thr Leu Trp Ser Ile Glu Trp Gly Leu Ala Glu Leu Val Asn
305 310 315 320 

CAC CCG GAG ATC CAG CAG AAG CTG CGC GAG GAG ATC GTC GCC GTT CTG 
His Pro Glu Ile Gln Gln Lys Leu Arg Glu Glu Ile Val Ala Val Leu
325 330 335 

GGC GCC GGC GTG GCG GTG ACG GAG CCG GAC CTG GAG CGC CTC CCC TAC 

Gly Ala Gly Val Ala Val Thr Glu Pro Asp Leu Glu Arg Leu Pro Tyr 
340 345 350 

CTG CAG TCC GTG GTG AAG GAG ACG CTC CGC CTC CGC ATG GCA ATC CCG 
Leu Gln Ser Val Val Lys Glu Thr Leu Arg Leu Arg Met Ala Ile Pro
355 360 365 

CTC CTG GTG CCG CAC ATG AAC CTC AGC GAC GCC AAG CTC GCC GGC TAC 
Leu Leu Val Pro His Met Asn Leu Ser Asp Ala Lys Leu Ala Gly Tyr 
370 375 380 

GAC ATC CCC GCC GAG TCC AAG ATC CTC GTC AAC GCC TGG TTC CTC GCC 
Asp Ile Pro Ala Glu Ser Lys Ile Leu Val Asn Ala Trp Phe Leu Ala
385 390 395 400 

AAC GAC CCC AAG CGG TGG GTG CGC GCC GAT GAG TTC AGG CCG GAG AGG 

Asn Asp Pro Lys Arg Trp Val Arg Ala Asp Glu Phe Arg Pro Glu Arg 
405 410 415

TTC CTC GAG GAG GAG AAG GCC GTC GAG GCC CAC GGC AAC GAT TTC CGG
Phe Leu Glu Glu Glu Lys Ala Val Glu Ala His Gly Asn Asp Phe Arg
420 425 430

TTC GTG CCC TTC GGC GTC GGC CGC CGG AGC TGC CCC GGG ATC ATC CTC 
Phe Val Pro Phe Gly Val Gly Arg Arg Ser Cys Pro Gly Ile Ile Leu
435 440 445 

GCG CTG CCC ATC ATC GGC ATC ACG CTC GGA CGC CTG GTG CAG AAC TTC 
Ala Leu Pro Ile Ile Gly Ile Thr Leu Gly Arg Leu Val Gln Asn Phe
450 455 460

CAG CTG CTG CCG CCG CCG GGG CAG GAC AAG ATC GAC ACC ACC GAG AAG 
Gln Leu Leu Pro Pro Pro Gly Gln Asp Lys Ile Asp Thr Thr Glu Lys
465 470 475 480 

CCC GGG CAG TTT ACC AAC CAG ATC CTC AAG CAC GCC ACC ATT GTC TGC 
Pro Gly Gln Phe Thr Asn Gln Ile Leu Lys His Ala Thr Ile Val Cys
485 490 495 

AAG CCA CTC GAG GCT TAA 
Lys Pro Leu Glu Ala 
500 




(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1506 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1503

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
ATG GAT GTT TTG TTG TTG GAA AAA GCT TTG TTG GGT TTG TTC GCC GCG 48
Met Asp Val Leu Leu Leu Glu Lys Ala Leu Leu Gly Leu Phe Ala Ala
1 5 10 15

GCG GTG CTG GCC ATC GCC GTC GCC AAG CTC ACC GGC AAG CGC TTC CGC 96 
Ala Val Leu Ala Ile Ala Val Ala Lys Leu Thr Gly Lys Arg Phe Arg
20 25 30 

CTC CCC CCT GGC CCC TCC GGC GCC CCC ATC GTC GGC AAC TGG CTG CAG 144 
Leu Pro Pro Gly Pro Ser Gly Ala Pro Ile Val Gly Asn Trp Leu Gln

35 40 45 

GTC GGC GAC GAC CTC AAC CAC CGC AAC CTG ATG GGC CTG GCC AAG CGG 192 
Val Gly Asp Asp Leu Asn His Arg Asn Leu Met Gly Leu Ala Lys Arg 
50 55 60 

TTC GGC GAG GTG TTC CTC CTC CGC ATG GGC GTC CGC AAC CTG GTG GTC 240 
Phe Gly Glu Val Phe Leu Leu Arg Met Gly Val Arg Asn Leu Val Val 
65 70 75 80 

GTC TCC AGC CCC GAG CTC GCC AAG GAG GTC CTC CAC ACC CAG GGC GTC 288 
Val Ser Ser Pro Glu Leu Ala Lys Glu Val Leu His Thr Gln Gly Val
85 90 95 

GAG TTC GGC TCC CGC ACC CGC AAC GTC GTC TTC GAC ATC TTC ACC GGC 336 
Glu Phe Gly Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly
100 105 110 

AAG GGA CAG GAC ATG GTG TTC ACG GTG TAC GGC GAC CAC TGG CGC AAG 384 
Lys Gly Gln Asp Met Val Phe Thr Val Tyr Gly Asp His Trp Arg Lys
115 120 125 

ATG CGG CGG ATC ATG ACG GTG CCC TTC TTC ACC AAC AAG GTG GTG GCG 432 
Met Arg Arg Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Ala
130 135 140 

CAG AAC CGC GTG GGG TGG GAG GAG GAG GCC CGG CTG GTG GTG GAG GAC 480 
Gln Asn Arg Val Gly Trp Glu Glu Glu Ala Arg Leu Val Val Glu Asp
145 150 155 160

CTC AAG GCC GAC CCG GCG GCG GCG ACG GCG GGC GTG GTG GTC CGC CGC 528 
Leu Lys Ala Asp Pro Ala Ala Ala Thr Ala Gly Val Val Val Arg Arg 
165 170 175 

AGG CTG CAG CTC ATG ATG TAC AAC GAC ATG TTC CGC ATC ATG TTC GAC 576 
Arg Leu Gln Leu Met Met Tyr Asn Asp Met Phe Arg Ile Met Phe Asp
180 185 190 

CGC CGG TTC GAG AGC GTG GCC GAC CCG CTC TTC AAC CAG CTC AAG GCG 624 
Arg Arg Phe Glu Ser Val Ala Asp Pro Leu Phe Asn Gln Leu Lys Ala
195 200 205 
CTC AAC GCC GAG CGC AGC ATC CTC TCC CAG AGC TTC GAC TAC AAC TAC 672
Leu Asn Ala Glu Arg Ser Ile Leu Ser Gln Ser Phe Asp Tyr Asn Tyr

210 215 220 

GGC GAC TTC ATC CCC GTC CTC CGC CCC TTC CTC CGC CGC TAC CTC AAC 720 
Gly Asp Phe Ile Pro Val Leu Arg Pro Phe Leu Arg Arg Tyr Leu Asn
225 230 235 240 

CGC TGC ACC AAC CTC AAG ACC AAG CGG ATG AAG GTG TTC GAG GAC CAC 768 
Arg Cys Thr Asn Leu Lys Thr Lys Arg Met Lys Val Phe Glu Asp His 
245 250 255 

TTC GTC CAG CAG CGC AAG GAG GCG TTG GAG AAG ACG GGT GAG ATC AGG 816 

Phe Val Gln Gln Arg Lys Glu Ala Leu Glu Lys Thr Gly Glu Ile Arg
260 265 270 

TGC GCC ATG GAC CAC ATC CTG GAA GCC GAA AGG AAG GGC GAG ATC AAC 864 
Cys Ala Met Asp His Ile Leu Glu Ala Glu Arg Lys Gly Glu Ile Asn
275 280 285 

CAC GAC AAC GTC CTC TAC ATC GTC GAG AAC ATC AAC GTC GCA GCC ATC 912 
His Asp Asn Val Leu Tyr Ile Val Glu Asn Ile Asn Val Ala Ala Ile
290 295 300 

GAG ACG ACG CTG TGG TCG ATC GAG TGG GGC CTC GCG GAG CTG GTG AAC 960 

Glu Thr Thr Leu Trp Ser Ile Glu Trp Gly Leu Ala Glu Leu Val Asn

305 310 315 320 

CAC CCG GAG ATC CAG CAG AAG CTG CGC GAG GAG ATC GTC GCC GTT CTG 1008 
His Pro Glu Ile Gln Gln Lys Leu Arg Glu Glu Ile Val Ala Val Leu
325 330 335 

GGC GCC GGC GTG GCG GTG ACG GAG CCG GAC CTG GAG CGC CTC CCC TAC 1056
Gly Ala Gly Val Ala Val Thr Glu Pro Asp Leu Glu Arg Leu Pro Tyr
340 345 350

CTG CAG TCC GTG GTG AAG GAG ACG CTC CGC CTC CGC ATG GCA ATC CCG 1104 
Leu Gln Ser Val Val Lys Glu Thr Leu Arg Leu Arg Met Ala Ile Pro
355 360 365

CTC CTG GTG CCG CAC ATG AAC CTC AGC GAC GCC AAG CTC GCC GGC TAC 1152 

Leu Leu Val Pro His Met Asn Leu Ser Asp Ala Lys Leu Ala Gly Tyr

370 375 380 

GAC ATC CCC GCC GAG TCC AAG ATC CTC GTC AAC GCC TGG TTC CTC GCC 1200 
Asp Ile Pro Ala Glu Ser Lys Ile Leu Val Asn Ala Trp Phe Leu Ala
385 390 395 400 

AAC GAC CCC AAG CGG TGG GTG CGC GCC GAT GAG TTC AGG CCG GAG AGG 1248 
Asn Asp Pro Lys Arg Trp Val Arg Ala Asp Glu Phe Arg Pro Glu Arg 
405 410 415

TTC CTC GAG GAG GAG AAG GCC GTC GAG GCC CAC GGC AAC GAT TTC CGG 1296 
Phe Leu Glu Glu Glu Lys Ala Val Glu Ala His Gly Asn Asp Phe Arg 
420 425 430 

TTC GTG CCC TTC GGC GTC GGC CGC CGG AGC TGC CCC GGG ATC ATC CTC 1344 
Phe Val Pro Phe Gly Val Gly Arg Arg Ser Cys Pro Gly Ile Ile Leu
435 440 445

GCG CTG CCC ATC ATC GGC ATC ACG CTC GGA CGC CTG GTG CAG AAC TTC 1392 
Ala Leu Pro Ile Ile Gly Ile Thr Leu Gly Arg Leu Val Gln Asn Phe
450 455 460 

CAG CTG CTG CCG CCG CCG GGG CAG GAC AAG ATC GAC ACC ACC GAG AAG 1440 
Gln Leu Leu Pro Pro Pro Gly Gln Asp Lys Ile Asp Thr Thr Glu Lys
465 470 475 480 
CCC GGG CAG TTT ACC AAC CAG ATC CTC AAG CAC GCC ACC ATT GTC TGC
Pro Gly Gln Phe Thr Asn Gln Ile Leu Lys His Ala Thr Ile Val Cys
485 490 495



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1506 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1503

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
ATG GAT GTT TTG TTG TTG GAA AAA GCT TTG TTG GGT TTG TTT GCT GCT 48
Met Asp Val Leu Leu Leu Glu Lys Ala Leu Leu Gly Leu Phe Ala Ala
1 5 10 15

GCT GTT TTG GCT ATT GCT GTT GCT AAA TTG ACT GGT AAA AGA TTT AGA 96 
Ala Val Leu Ala Ile Ala Val Ala Lys Leu Thr Gly Lys Arg Phe Arg
20 25 30

TTG CCA CCA GGT CCA TCC GGC GCC CCC ATC GTC GGC AAC TGG CTG CAG 144
Leu Pro Pro Gly Pro Ser Gly Ala Pro Ile Val Gly Asn Trp Leu Gln
35 40 45 

GTC GGC GAC GAC CTC AAC CAC CGC AAC CTG ATG GGC CTG GCC AAG CGG 192 
Val Gly Asp Asp Leu Asn His Arg Asn Leu Met Gly Leu Ala Lys Arg 
50 55 60 

TTC GGC GAG GTG TTC CTC CTC CGC ATG GGC GTC CGC AAC CTG GTG GTC 240 
Phe Gly Glu Val Phe Leu Leu Arg Met Gly Val Arg Asn Leu Val Val 
65 70 75 80 

GTC TCC AGC CCC GAG CTC GCC AAG GAG GTC CTC CAC ACC CAG GGC GTC 288 
Val Ser Ser Pro Glu Leu Ala Lys Glu Val Leu His Thr Gln Gly Val
85 90 95 

GAG TTC GGC TCC CGC ACC CGC AAC GTC GTC TTC GAC ATC TTC ACC GGC 336 
Glu Phe Gly Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly

100 105 110 

AAG GGA CAG GAC ATG GTG TTC ACG GTG TAC GGC GAC CAC TGG CGC AAG 384
Lys Gly Gln Asp Met Val Phe Thr Val Tyr Gly Asp His Trp Arg Lys

115 120 125 

ATG CGG CGG ATC ATG ACG GTG CCC TTC TTC ACC AAC AAG GTG GTG GCG 432 
Met Arg Arg Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Ala
130 135 140

CAG AAC CGC GTG GGG TGG GAG GAG GAG GCC CGG CTG GTG GTG GAG GAC 480 
Gln Asn Arg Val Gly Trp Glu Glu Glu Ala Arg Leu Val Val Glu Asp
145 150 155 160

CTC AAG GCC GAC CCG GCG GCG GCG ACG GCG GGC GTG GTG GTC CGC CGC 528 
Leu Lys Ala Asp Pro Ala Ala Ala Thr Ala Gly Val Val Val Arg Arg 

165 170 175 
AGG CTC CAG CTC ATG ATG TAC AAC GAC ATG TTC CGC ATC ATG TTC GAC
Arg Leu Gln Leu Met Met Tyr Asn Asp Met Phe Arg Ile Met Phe Asp

180 185 190 

CGC CGG TTC GAG AGC GTG GCC GAC CCG CTC TTC AAC CAG CTC AAG GCG 
Arg Arg Phe Glu Ser Val Ala Asp Pro Leu Phe Asn Gln Leu Lys Ala
195 200 205 

CTC AAC GCC GAG CGC AGC ATC CTC TCC CAG AGC TTC GAC TAC AAC TAC 

Leu Asn Ala Glu Arg Ser Ile Leu Ser Gln Ser Phe Asp Tyr Asn Tyr

210 215 220 

GGC GAC TTC ATC CCC GTC CTC CGC CCC TTC CTC CGC CGC TAC CTC AAC
Gly Asp Phe Ile Pro Val Leu Arg Pro Phe Leu Arg Arg Tyr Leu Asn
225 230 235 240 

CGC TGC ACC AAC CTC AAG ACC AAG CGG ATG AAG GTG TTC GAG GAC CAC 
Arg Cys Thr Asn Leu Lys Thr Lys Arg Met Lys Val Phe Glu Asp His 
245 250 255

TTC GTC CAG CAG CGC AAG GAG GCG TTG GAG AAG ACG GGT GAG ATC AGG 
Phe Val Gln Gln Arg Lys Glu Ala Leu Glu Lys Thr Gly Glu Ile Arg
260 265 270 

TGC GCC ATG GAC CAC ATC CTG GAA GCC GAA AGG AAG GGC GAG ATC AAC 
Cys Ala Met Asp His Ile Leu Glu Ala Glu Arg Lys Gly Glu Ile Asn
275 280 285

CAC GAC AAC GTC CTC TAC ATC GTC GAG AAC ATC AAC GTC GCA GCC ATC 
His Asp Asn Val Leu Tyr Ile Val Glu Asn Ile Asn Val Ala Ala Ile
290 295 300 

GAG ACG ACG CTG TGG TCG ATC GAG TGG GGC CTC GCG GAG CTG GTG AAC 
Glu Thr Thr Leu Trp Ser Ile Glu Trp Gly Leu Ala Glu Leu Val Asn
305 310 315 320 

CAC CCG GAG ATC CAG CAG AAG CTG CGC GAG GAG ATC GTC GCC GTT CTG 
His Pro Glu Ile Gln Gln Lys Leu Arg Glu Glu Ile Val Ala Val Leu
325 330 335 

GGC GCC GGC GTG GCG GTG ACG GAG CCG GAC CTG GAG CGC CTC CCC TAC 
Gly Ala Gly Val Ala Val Thr Glu Pro Asp Leu Glu Arg Leu Pro Tyr 
340 345 350 

CTG CAG TCC GTG GTG AAG GAG ACG CTC CGC CTC CGC ATG GCA ATC CCG 
Leu Gln Ser Val Val Lys Glu Thr Leu Arg Leu Arg Met Ala Ile Pro
355 360 365 

CTC CTG GTG CCG CAC ATG AAC CTC AGC GAC GCC AAG CTC GCC GGC TAC 
Leu Leu Val Pro His Met Asn Leu Ser Asp Ala Lys Leu Ala Gly Tyr 
370 375 380 

GAC ATC CCC GCC GAG TCC AAG ATC CTC GTC AAC GCC TGG TTC CTC GCC 
Asp Ile Pro Ala Glu Ser Lys Ile Leu Val Asn Ala Trp Phe Leu Ala
385 390 395 400 

AAC GAC CCC AAG CGG TGG GTG CGC GCC GAT GAG TTC AGG CCG GAG AGG 
Asn Asp Pro Lys Arg Trp Val Arg Ala Asp Glu Phe Arg Pro Glu Arg 
405 410 415 

TTC CTC GAG GAG GAG AAG GCC GTC GAG GCC CAC GGC AAC GAT TTC CGG 
Phe Leu Glu Glu Glu Lys Ala Val Glu Ala His Gly Asn Asp Phe Arg 
420 425 430

TTC GTG CCC TTC GGC GTC GGC CGC CGG AGC TGC CCC GGG ATC ATC CTC 
Phe Val Pro Phe Gly Val Gly Arg Arg Ser Cys Pro Gly Ile Ile Leu
435 440 445 






GCG CTG CCC ATC ATC GGC ATC ACG CTC GGA CGC CTG GTG CAG AAC TTC 
Ala Leu Pro Ile Ile Gly Ile Thr Leu Gly Arg Leu Val Gln Asn Phe
450 455 460 

CAG CTG CTG CCG CCG CCG GGG CAG GAC AAG ATC GAC ACC ACC GAG AAG 
Gln Leu Leu Pro Pro Pro Gly Gln Asp Lys Ile Asp Thr Thr Glu Lys
465 470 475 480 

CCC GGG CAG TTT ACC AAC CAG ATC CTC AAG CAC GCC ACC ATT GTC TGC 
Pro Gly Gln Phe Thr Asn Gln Ile Leu Lys His Ala Thr Ile Val Cys
485 490 495 

AAG CCA CTC GAG GCT TAA 
Lys Pro Leu Glu Ala 
500 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2181 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 112..1734

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
CGATCCACCC CTTGGATCCA CTCTACCCAG CTCGCTAGCC AGCGGGGTAC ATACACGCAC



GTG GGG ACG TGG GCG GTG GTG GTG TCG GCG GTG GCC GCG TAC ATG GCG 
Val Gly Thr Trp Ala Val Val Val Ser Ala Val Ala Ala Tyr Met Ala
5 10 15 

TGG TTC TGG CGG ATG TCC CGC GGG CTG CGC GGG CCG CGG GTT TGG CCC 
Trp Phe Trp Arg Met Ser Arg Gly Leu Arg Gly Pro Arg Val Trp Pro
20 25 30 

GTG CTC GGC AGC CTG CCG GGC CTG GTG CAG CAC GCC GAG GAC ATG CAC 

Val Leu Gly Ser Leu Pro Gly Leu Val Gln His Ala Glu Asp Met His
35 40 45 50

GAG TGG ATC GCC GGC AAC CTG CGC CGC GCG GGC GGC ACG TAC CAG ACC
Glu Trp Ile Ala Gly Asn Leu Arg Arg Ala Gly Gly Thr Tyr Gln Thr
55 60 65 

TGC ATC TTC GCC GTG CCC GGG GTG GCG CGC CGC GGC GGC CTG GTC ACC 
Cys Ile Phe Ala Val Pro Gly Val Ala Arg Arg Gly Gly Leu Val Thr
70 75 80 

GTC ACC TGC GAC CCG CGC AAC CTG GAG CAC GTC CTG AAG GCG CGC TTC 
Val Thr Cys Asp Pro Arg Asn Leu Glu His Val Leu Lys Ala Arg Phe
85 90 95

GAC AAC TAC CCC AAG GGC CCC TTC TGG CAC GGC GTC TTC CGG GAC CTG 
Asp Asn Tyr Pro Lys Gly Pro Phe Trp His Gly Val Phe Arg Asp Leu 

100 105 110



CTC GGC GAC GGC ATC TTC AAT TCC GAC GGC GAC ACC TGG CTC GCG CAG
Leu Gly Asp Gly Ile Phe Asn Ser Asp Gly Asp Thr Trp Leu Ala Gln
115 120 125 130 

CGC AAG ACG GCC GCG CTC GAG TTC ACC ACC CGC ACG CTC CGG ACG GCC 
Arg Lys Thr Ala Ala Leu Glu Phe Thr Thr Arg Thr Leu Arg Thr Ala 
135 140 145 

ATG TCC CGC TGG GTC TCG CGC TCC ATC CAC GGC CGC CTC CTG CCC ATC 
Met Ser Arg Trp Val Ser Arg Ser Ile His Gly Arg Leu Leu Pro Ile
150 155 160

CTG GCC GAC GCG GCC AAG GGC AAG GCG CAG GTG GAT CTC CAG GAC CTC 
Leu Ala Asp Ala Ala Lys Gly Lys Ala Gln Val Asp Leu Gln Asp Leu
165 170 175 

CTC CTC CGC CTC ACC TTC GAC AAC ATC TGC GGC CTG GCC TTC GGC AAG 
Leu Leu Arg Leu Thr Phe Asp Asn Ile Cys Gly Leu Ala Phe Gly Lys
180 185 190 

GAC CCG GAG ACG CTC GCC CAG GGC CTG CCG GAG AAC GAG TTC GCC TCC 
Asp Pro Glu Thr Leu Ala Gln Gly Leu Pro Glu Asn Glu Phe Ala Ser
195 200 205 210 

GCG TTC GAC CGC GCC ACC GAG GCC ACG CTC AAC CGC TTC ATC TTC CCG 
Ala Phe Asp Arg Ala Thr Glu Ala Thr Leu Asn Arg Phe Ile Phe Pro
215 220 225 

GAG TTC CTG TGG CGC TGC AAA AAG TGG CTG GGC CTC GGC ATG GAG ACC 

Glu Phe Leu Trp Arg Cys Lys Lys Trp Leu Gly Leu Gly Met Glu Thr 
230 235 240 

ACG CTG ACC AGC AGC ATG GCC CAC GTC GAC CAG TAC CTC GCC GCC GTC 
Thr Leu Thr Ser Ser Met Ala His Val Asp Gln Tyr Leu Ala Ala Val
245 250 255 

ATC AAG AAG CGC AAG CTC GAG CTC GCC GCC GGC AAC GGC AAA TGC GAC 
Ile Lys Lys Arg Lys Leu Glu Leu Ala Ala Gly Asn Gly Lys Cys Asp
260 265 270 

ACG GCG GCG ACG CAC GAC GAC CTG CTC TCC CGG TTC ATG CGG AAG GGT
Thr Ala Ala Thr His Asp Asp Leu Leu Ser Arg Phe Met Arg Lys Gly
275 280 285 290 

TCC TAC TCG GAC GAG TCG CTC CAG CAC GTG GCG CTC AAC TTC ATC CTC 
Ser Tyr Ser Asp Glu Ser Leu Gln His Val Ala Leu Asn Phe Ile Leu

295 300 305 

GCC GGC CGC GAC ACC TCC TCC GTG GCG CTC TCC TGG TTC TTC TGG CTC 
Ala Gly Arg Asp Thr Ser Ser Val Ala Leu Ser Trp Phe Phe Trp Leu 
310 315 320 

GTG TCC ACC CAC CCT GCG GTG GAG CGC AAG ATC GTG CGC GAG CTC TGC 
Val Ser Thr His Pro Ala Val Glu Arg Lys Ile Val Arg Glu Leu Cys
325 330 335 

TCC GTT CTC GCC GCG TCA CGG GGC GCC CAT GAC CCG GCA TTG TGG CTG 
Ser Val Leu Ala Ala Ser Arg Gly Ala His Asp Pro Ala Leu Trp Leu
340 345 350 

GCG GAG CCC TTC ACC TTC GAG GAG CTC GAC CGC CTG GTC TAC CTC AAG 
Ala Glu Pro Phe Thr Phe Glu Glu Leu Asp Arg Leu Val Tyr Leu Lys
355 360 365 370 

GCG GCG CTG TCG GAG ACC CTC CGC CTC TAC CCC TCC GTC CCC GAG GAC 
Ala Ala Leu Ser Glu Thr Leu Arg Leu Tyr Pro Ser Val Pro Glu Asp 
375 380 385 
TCC AAG CAC GTC GTC GCG GAC GAC TAC CTC CCC GAC GGC ACC TTC GTG 1317
Ser Lys His Val Val Ala Asp Asp Tyr Leu Pro Asp Gly Thr Phe Val 
390 395 400 

CCG GCC GGG TCG TCG GTC ACC TAC TCC ATA TAC TCG GCG GGG CGC ATG 1365
Pro Ala Gly Ser Ser Val Thr Tyr Ser Ile Tyr Ser Ala Gly Arg Met
405 410 415 

AAG GGG GTG TGG GGG GAG GAC TGC CTC GAG TTC CGG CCG GAG CGA TGG 1413
Lys Gly Val Trp Gly Glu Asp Cys Leu Glu Phe Arg Pro Glu Arg Trp 
420 425 430 

CTG TCG GCC GAC GGC ACC AAG TTC GAG CAG CAC GAC TCG TAC AAG TTC 1461 
Leu Ser Ala Asp Gly Thr Lys Phe Glu Gln His Asp Ser Tyr Lys Phe
435 440 445 450

GTG GCG TTC AAC GCC GGG CCG AGG GTG TGC CTG GGC AAG GAC CTA GCC 1509 
Val Ala Phe Asn Ala Gly Pro Arg Val Cys Leu Gly Lys Asp Leu Ala 
455 460 465 

TAC CTG CAG ATG AAG AAC ATC GCC GGG AGC GTG CTG CTC CGG CAC CGC 1557 
Tyr Leu Gln Met Lys Asn Ile Ala Gly Ser Val Leu Leu Arg His Arg
470 475 480 

CTG ACC GTG GCG CCG GGC CAC CGC GTG GAG CAG AAG ATG TCG CTC ACG 1605 
Leu Thr Val Ala Pro Gly His Arg Val Glu Gln Lys Met Ser Leu Thr
485 490 495 

CTC TTC ATG AAG GGC GGG CTA CGG ATG GAG GTA CGT CCG CGC GAC CTC 1653 
Leu Phe Met Lys Gly Gly Leu Arg Met Glu Val Arg Pro Arg Asp Leu
500 505 510 

GCC CCC GTC CTC GAC GAG CCC TGC GGC CTG GAC GCC GGC GCC GCC ACC 1701 
Ala Pro Val Leu Asp Glu Pro Cys Gly Leu Asp Ala Gly Ala Ala Thr 
515 520 525 530 

GCC GCC GCA GCA AGT GCC ACA GCG CCG TGC GCG TAGAAGACCT GGCACCGGCA 1754
Ala Ala Ala Ala Ser Ala Thr Ala Pro Cys Ala 
535 540 

CGCGCCATGC ATGATTCGTG CGTGCTAGCT GTTGAAGGGA CGCCGGACAT TGAATGTGTA 1814 

GATAGGGCAG CAGTGCAAGA CCGTAAGTAA AATTGATGAT GGGTTTGGTG ACAACATTGA 1874

AGCCACTCCT TTCCAGAATT TACGACCCGO ATAGGAGAAA CAGGGAAACT TTGCAGATCA 1934 

CAACACAAGA TCTAGCCAGC CGGGGATCTG ATCTGATTTG CGTCTGCTCG GAGCACGGGT 1994 

GCATGGGAGA CCAAGGAGGA AAACAAAAAA TAACAGAAAC AGAGTGAGCA ATATTTGTGA 2054 

TTGTAGCCAC GGGAAAGAGA GAGGAGTAAT TAGTAATTCA GATTTGTTTG CAGTAGCTCG 2114 

GTCTTGGTGA CCAGATCATA GCCAACTAGG CTATTCTATT CTATTCTATT TTTGAAGATG 2174

ATTTTTC 2181 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
ATATATGGAT CCATGGAGGT GGGGACGTGG GCGGTGGTG 39 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 base pairs

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ATATATGGAT CCATGGAAGT TGGTACTTGG GCTGTTGTTG TTTCTGCTGT TGCTGCTTAT 60
ATGGCTTGGT TTTGGAGAAT GTCTAGAGGT TTGAGAGGTC CAAGAGTTTG GCCAGTTTTG 120
GGTTCTTTGC CAGGCCTGCT GCAGCACGCC 150



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "reverse" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
TATATAGAAT TCCTTCTACG CGCACGGCGC TGTGGCACTT GC 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1626 base pairs

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1623 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ATG GAA GTT GGT ACT TGG GCT GTT GTT GTT TCT GCT GTT GCT GCT TAT
Met Glu Val Gly Thr Trp Ala Val Val Val Ser Ala Val Ala Ala Tyr
1 5 10 15

ATG GCT TGG TTT TGG AGA ATG TCT AGA GGT TTG AGA GGT CCA AGA GTT 
Met Ala Trp Phe Trp Arg Met Ser Arg Gly Leu Arg Gly Pro Arg Val
20 25 30

TGG CCA GTT TTG GGT TCT TTG CCA GGC CTG GTG CAG CAC GCC GAG GAC 
Trp Pro Val Leu Gly Ser Leu Pro Gly Leu Val Gln His Ala Glu Asp
ATG CAC GAG TGG ATC GCC GGC AAC CTG CGC CGC GCG GGC GGC ACG TAC 192
Met His Glu Trp Ile Ala Gly Asn Leu Arg Arg Ala Gly Gly Thr Tyr
50 55 60

CAG ACC TGC ATC TTC GCC GTG CCC GGG GTG GCG CGC CGC GGC GGC CTG 240 
Gln Thr Cys Ile Phe Ala Val Pro Gly Val Ala Arg Arg Gly Gly Leu
65 70 75 80

GTC ACC GTC ACC TGC GAC CCG CGC AAC CTG GAG CAC GTC CTG AAG GCG 288 
Val Thr Val Thr Cys Asp Pro Arg Asn Leu Glu His Val Leu Lys Ala 
85 90 95

CGC TTC GAC AAC TAC CCC AAG GGC CCC TTC TGG CAC GGC GTC TTC CGG 336 
Arg Phe Asp Asn Tyr Pro Lys Gly Pro Phe Trp His Gly Val Phe Arg 
100 105 110 

GAC CTG CTC GGC GAC GGC ATC TTC AAT TCC GAC GGC GAC ACC TGG CTC 384 
Asp Leu Leu Gly Asp Gly Ile Phe Asn Ser Asp Gly Asp Thr Trp Leu

115 120 125 

GCG CAG CGC AAG ACC GCC GCG CTC GAG TTC ACC ACC CGC ACG CTC CGG 432 
Ala Gln Arg Lys Thr Ala Ala Leu Glu Phe Thr Thr Arg Thr Leu Arg
130 135 140 

ACG GCC ATG TCC CGC TGG GTC TCG CGC TCC ATC CAC GGC CGC CTC CTG 480 
Thr Ala Met Ser Arg Trp Val Ser Arg Ser Ile His Gly Arg Leu Leu
145 150 155 160 

CCC ATC CTG GCC GAC GCG GCC AAG GGC AAG GCG CAG GTG GAT CTC CAG 528
Pro Ile Leu Ala Asp Ala Ala Lys Gly Lys Ala Gln Val Asp Leu Gln
165 170 175 

GAC CTC CTC CTC CGC CTC ACC TTC GAC AAC ATC TGC GGC CTG GCC TTC 576 
Asp Leu Leu Leu Arg Leu Thr Phe Asp Asn Ile Cys Gly Leu Ala Phe
180 185 190 

GGC AAG GAC CCG GAG ACG CTC GCC CAG GGC CTG CCG GAG AAC GAG TTC 624 
Gly Lys Asp Pro Glu Thr Leu Ala Gln Gly Leu Pro Glu Asn Glu Phe
195 200 205 

GCC TCC GCG TTC GAC CGC GCC ACC GAG GCC ACG CTC AAC CGC TTC ATC 672 
Ala Ser Ala Phe Asp Arg Ala Thr Glu Ala Thr Leu Asn Arg Phe Ile
210 215 220 

TTC CCG GAG TTC CTG TGG CGC TGC AAA AAG TGG CTG GGC CTC GGC ATG 720 
Phe Pro Glu Phe Leu Trp Arg Cys Lys Lys Trp Leu Gly Leu Gly Met 
225 230 235 240 

GAG ACC ACG CTG ACC AGC AGC ATG GCC CAC GTC GAC CAG TAC CTC GCC 768 

Glu Thr Thr Leu Thr Ser Ser Met Ala His Val Asp Gln Tyr Leu Ala
245 250 255

GCC GTC ATC AAG AAG CGC AAG CTC GAG CTC GCC GCC GGC AAC GGC AAA 816 
Ala Val Ile Lys Lys Arg Lys Leu Glu Leu Ala Ala Gly Asn Gly Lys
260 265 270 

TGC GAC ACG GCG GCG ACG CAC GAC GAC CTG CTC TCC CGG TTC ATG CGG 864 
Cys Asp Thr Ala Ala Thr His Asp Asp Leu Leu Ser Arg Phe Met Arg 
275 280 285 

AAG GGT TCC TAC TCG GAC GAG TCG CTC CAG CAC GTG GCG CTC AAC TTC 912 
Lys Gly Ser Tyr Ser Asp Glu Ser Leu Gln His Val Ala Leu Asn Phe
290 295 300 

ATC CTC GCC GGC CGC GAC ACC TCC TCC GTG GCG CTC TCC TGG TTC TTC 960 
Ile Leu Ala Gly Arg Asp Thr Ser Ser Val Ala Leu Ser Trp Phe Phe

305 310 315 320 
TGG CTC GTG TCC ACC CAC CCT GCG GTG GAG CGC AAG ATC GTG CGC GAG
Trp Leu Val Ser Thr His Pro Ala Val Glu Arg Lys Ile Val Arg Glu
325 330 335

CTC TGC TCC GTT CTC GCC GCG TCA CGG GGC GCC CAT GAC CCG GCA TTG 
Leu Cys Ser Val Leu Ala Ala Ser Arg Gly Ala His Asp Pro Ala Leu
340 345 350

TGG CTG GCG GAG CCC TTC ACC TTC GAG GAG CTC GAC CGC CTG GTC TAC 

Trp Leu Ala Glu Pro Phe Thr Phe Glu Glu Leu Asp Arg Leu Val Tyr 

355 360 365 

CTC AAG GCG GCG CTG TCG GAG ACC CTC CGC CTC TAC CCC TCC GTC CCC 
Leu Lys Ala Ala Leu Ser Glu Thr Leu Arg Leu Tyr Pro Ser Val Pro 
370 375 380 

GAG GAC TCC AAG CAC GTC GTC GCG GAC GAC TAC CTC CCC GAC GGC ACC 
Glu Asp Ser Lys His Val Val Ala Asp Asp Tyr Leu Pro Asp Gly Thr 
385 390 395 400

TTC GTG CCG GCC GGG TCG TCG GTC ACC TAC TCC ATA TAC TCG GCG GGG 
Phe Val Pro Ala Gly Ser Ser Val Thr Tyr Ser Ile Tyr Ser Ala Gly
405 410 415 

CGC ATG AAG GGG GTG TGG GGG GAG GAC TGC CTC GAG TTC CGG CCG GAG 
Arg Met Lys Gly Val Trp Gly Glu Asp Cys Leu Glu Phe Arg Pro Glu 
420 425 430 

CGA TGG CTG TCG GCC GAC GGC ACC AAG TTC GAG CAG CAC GAC TCG TAC 
Arg Trp Leu Ser Ala Asp Gly Thr Lys Phe Glu Gln His Asp Ser Tyr
435 440 445 

AAG TTC GTG GCG TTC AAC GCC GGG CCG AGG GTG TGC CTG GGC AAG GAC 
Lys Phe Val Ala Phe Asn Ala Gly Pro Arg Val Cys Leu Gly Lys Asp 

450 455 460 

CTA GCC TAC CTG CAG ATG AAG AAC ATC GCC GGG AGC GTG CTG CTC CGG 
Leu Ala Tyr Leu Gln Met Lys Asn Ile Ala Gly Ser Val Leu Leu Arg
465 470 475 480 

CAC CGC CTG ACC GTG GCG CCG GGC CAC CGC GTG GAG CAG AAG ATG TCG 
His Arg Leu Thr Val Ala Pro Gly His Arg Val Glu Gln Lys Met Ser
485 490 495 

CTC ACG CTC TTC ATG AAG GGC GGG CTA CGG ATG GAG GTA CGT CCG CGC 
Leu Thr Leu Phe Met Lys Gly Gly Leu Arg Met Glu Val Arg Pro Arg 
500 505 510 

GAC CTC GCC CCC GTC CTC GAC GAG CCC TGC GGC CTG GAC GCC GGC GCC 
Asp Leu Ala Pro Val Leu Asp Glu Pro Cys Gly Leu Asp Ala Gly Ala 
515 520 525

GCC ACC GCC GCC GCA GCA AGT GCC ACA GCG CCG TGC GCG TAG 

Ala Thr Ala Ala Ala Ala Ser Ala Tar Ala Pro Cys Ala 

530 535 540 
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CLAIMS 

1. DNA sequence which encodes a protein of
interest which contains regions having a high content
of codons which are poorly suited to yeasts,
characterized in that a sufficient number of codons
which are poorly suited to yeasts is replaced with
corresponding codons which are well-suited to yeasts in
the said regions having a high content of codons which
are poorly suited to yeasts.

2. Sequence according to claim 1,
characterized in that the codons which are poorly
suited to yeasts are selected from among codons whose
frequency of use by yeasts is less than or equal to
approximately 13 per 1000, preferably less than or
equal to approximately 12 per 1000, more preferably
less than or equal to approximately 10 per 1000.

3. Sequence according to claim 2,
characterized in that the codons which are poorly
suited to yeasts are selected from among codons CTC,
CTG and CTT, which encode leucine, codons CGG, CGC,
CGA, CGT and AGG, which encode arginine, codons GCG and
GCC, which encode alanine, codons GGG, GGC and GGA,
which encode glycine, and codons CCG and CCC, which
encode proline.

4. Sequence according to claim 3,
characterized in that the codons which are poorly
suited to yeasts are selected from among codons CTC and
CTG, which encode leucine, codons CGG, CGC, CGA, CGT
and AGG, which encode arginine, codons GCG and GCC,
which encode alanine, codons GGG and GGC, which encode
glycine, and codons CCG and CCC, which encode proline.

5. Sequence according to one of claims 1 to
4, characterized in that the corresponding codons which
are well-suited to yeasts are selected from among
codons which correspond to the codons which are poorly
suited to yeasts and which encode the same amino acids,
and whose frequency of use by yeasts is greater than 15
per 1000, preferably greater than or equal to 18 per
1000, more preferably greater than or equal to 20 per
1000.

6. Sequence according to claim 5,
characterized in that the corresponding codons which
are well-suited to yeasts are selected from among
codons TTG and TTA, preferably TTG, which encode
leucine, codon AGA, which encodes arginine, codons GCT
and GCA, preferably GCT, which encode alanine, codon
GGT, which encodes glycine, and codon CCA, which
encodes proline.
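
Purely by way of illustration, and not as part of the claimed subject matter, the substitution described in claims 3 and 6 can be sketched in Python. The sketch assumes that every poorly suited codon listed in claim 3 is replaced with the preferred codon of claim 6 for the same amino acid, whereas the claims only require that a sufficient number of codons be replaced in the regions concerned. The names POORLY_SUITED, PREFERRED, split_codons and recode are hypothetical.

```python
# Codons listed in claim 3 as poorly suited to yeasts, grouped by amino acid.
POORLY_SUITED = {
    "Leu": {"CTC", "CTG", "CTT"},
    "Arg": {"CGG", "CGC", "CGA", "CGT", "AGG"},
    "Ala": {"GCG", "GCC"},
    "Gly": {"GGG", "GGC", "GGA"},
    "Pro": {"CCG", "CCC"},
}

# Preferred well-suited codon for each amino acid, taken from claim 6.
PREFERRED = {"Leu": "TTG", "Arg": "AGA", "Ala": "GCT", "Gly": "GGT", "Pro": "CCA"}

# Map each poorly suited codon to its replacement (assumption: replace all of them).
REPLACEMENT = {
    codon: PREFERRED[amino_acid]
    for amino_acid, codons in POORLY_SUITED.items()
    for codon in codons
}


def split_codons(dna):
    """Split an in-frame coding sequence into upper-case codons."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def recode(dna):
    """Replace every poorly suited codon; leave all other codons unchanged."""
    return "".join(REPLACEMENT.get(codon, codon) for codon in split_codons(dna))
```

Applied to the first eight codons of the coding region of sequence ID No. 1 (atggacgtcctcctcctggagaag), this sketch returns ATGGACGTCTTGTTGTTGGAGAAG: the three leucine codons are replaced and the other codons are left as they are.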

7. Sequence according to one of claims 1 to
7, characterized in that the regions having a high
content of codons which are poorly suited to yeasts
contain at least 2 poorly suited codons among 10
consecutive codons, with it being possible for the two
codons to be adjacent or separated by up to 8 other
codons.
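
As a non-limiting illustration, the criterion of claim 7 (at least 2 poorly suited codons among 10 consecutive codons) can be expressed as a sliding-window scan. The Python sketch below assumes the codon list of claim 3 and an in-frame coding sequence; the names POOR_CODONS and high_content_windows are hypothetical.

```python
# Poorly suited codons as listed in claim 3 (assumption: this list is used as-is).
POOR_CODONS = {
    "CTC", "CTG", "CTT",                # leucine
    "CGG", "CGC", "CGA", "CGT", "AGG",  # arginine
    "GCG", "GCC",                       # alanine
    "GGG", "GGC", "GGA",                # glycine
    "CCG", "CCC",                       # proline
}


def high_content_windows(dna, window=10, min_poor=2):
    """Return the 0-based codon index of every window of `window` consecutive
    codons that contains at least `min_poor` poorly suited codons."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    flags = [codon in POOR_CODONS for codon in codons]
    starts = []
    # A sequence shorter than the window is examined as a single window.
    for start in range(max(1, len(codons) - window + 1)):
        if sum(flags[start:start + window]) >= min_poor:
            starts.append(start)
    return starts
```

For the first ten codons of the coding region of sequence ID No. 1, the scan reports one qualifying window starting at codon index 0; for the corresponding recoded stretch of sequence ID No. 8 it reports none.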

8. Sequence according to claim 7,
characterized in that the regions having a high content
of poorly suited codons contain 2, 3, 4, 5 or 6 poorly
suited codons per 10 consecutive codons, or contain at
least 2 or 3 adjacent poorly suited codons.

9. DNA, in particular cDNA, sequence which
encodes a protein of interest which contains regions
having a high content of leucine, characterized in that
a sufficient number of CTC codons encoding leucine in
the said region having a high content of leucine is
replaced with TTG and/or TTA codons, or in that a
sufficient number of CTC and CTG codons encoding
leucine in the said region having a high content of
leucine is replaced with TTG and/or TTA codons.

10. Sequence according to claim 9,
characterized in that the CTC or CTC and CTG codons are
replaced with a TTG codon.

11. Sequence according to one of claims 9 or
10, characterized in that the regions having a high
content of leucine contain 2, 3, 4, 5 or 6 leucines per
10 consecutive amino acids, or contain at least 2 or 3
adjacent leucines.

12. Sequence according to one of claims 1 to
11, characterized in that the general content of poorly
suited codons is at least 20%, more preferably at least
30%, as compared with the total number of codons.
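
By way of example only, the general content of poorly suited codons referred to in claim 12 can be computed as the share of such codons in the total number of codons. The Python sketch below again assumes the codon list of claim 3; the names POOR_CODONS and poor_codon_content are hypothetical.

```python
# Poorly suited codons as listed in claim 3 (assumption: this list is used as-is).
POOR_CODONS = {
    "CTC", "CTG", "CTT",                # leucine
    "CGG", "CGC", "CGA", "CGT", "AGG",  # arginine
    "GCG", "GCC",                       # alanine
    "GGG", "GGC", "GGA",                # glycine
    "CCG", "CCC",                       # proline
}


def poor_codon_content(dna):
    """Return the percentage of poorly suited codons among all codons."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    poor = sum(1 for codon in codons if codon in POOR_CODONS)
    return 100.0 * poor / len(codons)
```

A sequence would meet the 20% threshold of claim 12 when this function returns a value of at least 20.0 for its coding region.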

13. Sequence according to one of claims 1 to
12, characterized in that it contains at least one 5'
region having a high content of codons which are poorly
suited to yeasts.

14. Sequence according to claim 13,
characterized in that the codons which are poorly
suited to yeasts are replaced only in this 5' region.

15. Sequence according to one of claims 1 to
14, characterized in that it is an isolated DNA
sequence of natural origin, in particular of plant
origin.

16. Sequence according to claim 15,
characterized in that it originates from dicotyledonous
or monocotyledonous plants, in particular from
monocotyledonous plants.

17. Sequence according to claim 16,
characterized in that it originates from plants of the
graminae family, which are selected, in particular,
from among wheat, barley, oats, rice, maize, sorghum
and cane sugar.

18. Sequence according to one of claims 1 to
17, characterized in that it encodes an enzyme.

19. Sequence according to claim 18,
characterized in that it encodes a cytochrome P450.

20. Sequence according to claim 19,
characterized in that the sequence which contains
regions having a high content of codons which are
poorly suited to yeasts includes the coding region of
the sequences ID No. 1 or ID No. 10.

21. Sequence according to claim 19,
characterized in that it is one of the sequences ID
No. 7, ID No. 8, ID No. 9 and ID No. 13.

22. Chimeric gene which contains a modified
DNA sequence according to one of claims 1 to 21 and
heterologous 5' and 3' regulatory elements which are
able to function in a yeast.

23. Vector for transforming yeasts which
contains at least one chimeric gene according to claim
22.

24. Process for transforming yeasts using a
vector according to claim 23.

25. Transformed yeast for expressing a
protein of interest, characterized in that it contains
a chimeric gene according to claim 22.

26. Yeast according to claim 25,
characterized in that it is selected from among the
genera Saccharomyces, Kluyveromyces, Hansenula, Pichia
and Yarrowia, advantageously from the genus
Saccharomyces, in particular S. cerevisiae.

27. Process for producing a heterologous
protein of interest in a transformed yeast,
characterized in that it comprises the steps of:

a) transforming a yeast with a vector
according to claim 23 which contains a modified DNA
sequence according to one of claims 1 to 21 and
heterologous 5' and 3' regulatory elements which are
able to function in a yeast,

b) culturing the transformed yeast, and

c) extracting the protein of interest from
the yeast culture.

28. Process for transforming a substrate by
enzymic catalysis using an enzyme which is expressed in
a yeast, which process comprises the steps of

a) culturing, in the presence of the
substrate to be transformed, the yeast which has been
transformed with a vector according to claim 23 which
contains a modified DNA sequence according to one of
claims 1 to 21 and heterologous 5' and 3' regulatory
elements which are able to function in a yeast, and
then

b) recovering the transformed substrate from
the yeast culture.

THE RECODING OF DNA SEQUENCES TO ENABLE THEM TO BE
EXPRESSED IN YEASTS, AND THE TRANSFORMED YEASTS
OBTAINED

Abstract 

The present invention relates to a DNA 
sequence which encodes a protein of interest which 
contains regions having a high content of codons which 
are poorly suited to yeasts, characterized in that a 
sufficient number of codons which are poorly suited to 
yeasts is replaced with corresponding codons which are 
well-suited to yeasts in the said regions having a high 
content of codons which are poorly suited to yeasts. 

The present invention relates, more 
specifically, to DNA sequences which originate from 
dicotyledonous or monocotyledonous plants, in 
particular plants of the graminae family which are 
selected, in particular, from among wheat, barley, 
oats, rice, maize, sorghum and cane sugar. 

The present invention also relates to
transformed yeasts which contain a DNA sequence
according to the invention.

SEQUENCE LISTING



<110> Batard, Yannick 
Durst, Francis 
Schalk, Michel 
Werck-Reichhart, Daniele

<120> RECODING OF DNA SEQUENCES PERMITTING EXPRESSION IN YEAST AND OBTAINED TRANSFORMED YEAST



<130> A32000 

<140> 09/158,767 
<141> 1998-09-23 

<150> FR 97-12094 
<151> 1997-09-24 

<160> 20 

<170> FastSEQ for Windows Version 3.0 

<210> 1 

<211> 2261 

<212> DNA 

<213> Triticum aestivum 



<400> 1 



cgcagcacgg caacacatac acaggagcca cacaccgcac ctaccccgat ggacgtcctc 60
ctcctggaga aggccctcct gggcctcttc gccgcggcgg tgctggccat cgccgtcgcc 120
aagctcaccg gcaagcgctt ccgcctcccc cctggcccct ccggcgcccc catcgtcggc 180
aactggctgc aggtcggcga cgacctcaac caccgcaacc tgatgggcct ggccaagcgg 240
ttcggcgagg tgttcctcct ccgcatgggc gtccgcaacc tggtggtcgt ctccagcccc 300
gagctcgcca aggaggtcct ccacacccag ggcgtcgagt tcggctcccg cacccgcaac 360
gtcgtcttcg acatcttcac cggcaaggga caggacatgg tgttcacggt gtacggcgac 420
cactggcgca agatgcggcg gatcatgacg gtgcccttct tcaccaacaa ggtggtggcg 480
cagaaccgcg tggggtggga ggaggaggcc cggctggtgg tggaggacct caaggccgac 540
ccggcggcgg cgacggcggg cgtggtggtc cgccgcaggc tgcagctcat gatgtacaac 600
gacatgttcc gcatcatgtt cgaccgccgg ttcgagagcg tggccgaccc gctcttcaac 660
cagctcaagg cgctcaacgc cgagcgcagc atcctctccc agagcttcga ctacaactac 720
ggcgacttca tccccgtcct ccgccccttc ctccgccgct acctcaaccg ctgcaccaac 780
ctcaagacca agcggatgaa ggtgttcgag gaccacttcg tccagcagcg caaggaggcg 840
ttggagaaga cgggtgagat caggtgcgcc atggaccaca tcctggaagc cgaaaggaag 900



ggcgagatca accacgacaa cgtcctctac atcgtcgaga acatcaacgt cgcagccatc 960
gagacgacgc tgtggtcgat cgagtggggc ctcgcggagc tggtgaacca cccggagatc 1020
cagcagaagc tgcgcgagga gatcgtcgcc gttctgggcg ccggcgtggc ggtgacggag 1080
ccggacctgg agcgcctccc ctacctgcag tccgtggtga aggagacgct ccgcctccgc 1140
atggcaatcc cgctcctggt gccgcacatg aacctcagcg acgccaagct cgccggctac 1200
gacatccccg ccgagtccaa gatcctcgtc aacgcctggt tcctcgccaa cgaccccaag 1260
cggtgggtgc gcgccgatga gttcaggccg gagaggttcc tcgaggagga gaaggccgtc 1320
gaggcccacg gcaacgattt ccggttcgtg cccttcggcg tcggccgccg gagctgcccc 1380
gggatcatcc tcgcgctgcc catcatcggc atcacgctcg gacgcctggt gcagaacttc 1440
cagctgctgc cgccgccggg gcaggacaag atcgacacca ccgagaagcc cgggcagttt 1500
accaaccaga tcctcaagca cgccaccatt gtctgcaagc cactcgaggc ttaactgaat 1560
tgaggtttcg gtcatgggcg cccgctgacg cggggagatg gatctatgca tgtgactgtg 1620
tattttgcct tctttctttt tggtgttgtt ttttgcagta gtaagtttaa tttttctttg 1680
gtgttggcct atttgtcttc atgtgaggcg tcgtgttgta aatttccata tagttggcaa 1740
tgtgatgtaa aacttggctc caaaaaaaaa aaaaaaaaac tcgagactct tctctctctc 1800
tctctctctc cagcctcggg tctctgctgg caagggaact tgcattaccc tgtgtacgac 1860
ggcgccatgt tcgtccctga agcaccctcc ctgcagagct cccaggacaa cttcgctgca 1920
ftgtgctggtt tcaagcgtcg aaggagagag ttttgaatac ccgaaagaat atagcgttgg 1980
aBatatctgt caaacagggg atcttgctgt gggtctcttg gtgggccaaa tcgcatagac 2040
a^tcattcaa atggatgggt tcttcgctgg tcggtcaaaa agtatatgtt gtaattgtac 2100
gpjcttttttg ggtcttgttg ccaaagatca tggttattga gttgtgagct ctgagataac 2160
a^gtttgtgt atagtgaaat aaagaggagc gtcgtcaaca ccatgtacta tataggcttt 2220
gSaattccat taagatgcat cagaaatcaa tgttggattt g 2261

<210> 2
<211> 38
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic primer

<400> 2 

atatatggat ccatggacgt cctcctcctg gagaaggc 38 

<210> 3 
<211> 56 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic primer 



<400> 3 

atatatggat ccatggatgt tttgttgttg gagaaggccc tcctgggcct cttcgc 56
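
As an illustrative check only, the following Python sketch translates the coding portions of the primers of sequences ID No. 2 and ID No. 3 with the standard genetic code, to show that the recoded primer specifies the same amino acids as the native one. It assumes that the first 12 nucleotides (atatatggatcc) are a non-coding 5' extension preceding the ATG; the names CODON_TABLE, translate and EXTENSION_LENGTH are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons of `dna` into one-letter amino acid codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Assumption: the first 12 nucleotides of both primers precede the ATG.
EXTENSION_LENGTH = 12

native_primer = "atatatggatccatggacgtcctcctcctggagaaggc"                      # ID No. 2
recoded_primer = "atatatggatccatggatgttttgttgttggagaaggccctcctgggcctcttcgc"   # ID No. 3

native_peptide = translate(native_primer[EXTENSION_LENGTH:])
recoded_peptide = translate(recoded_primer[EXTENSION_LENGTH:])
```

Both primers encode the same first eight residues (Met Asp Val Leu Leu Leu Glu Lys), which agrees with the start of the amino acid sequences given later in the listing.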






<210> 4 
<211> 71 
<212> DNA 

<213> Triticum aestivum 
<220> 

<223> Synthetic primer 
<400> 4 

atatatggat ccatggatgt tttgttgttg gaaaaagctt tgttgggttt gttcgccgcg 60 
gcggtgctgg c 71 

<210> 5 
<211> 143 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic primer 
<400> 5 

atatatggat ccatggatgt tttgttgttg gaaaaagctt tgttgggttt gtttgctgct 60
gctgttttgg ctattgctgt tgctaaattg actggtaaaa gatttagatt gccaccaggt 120
ccatccggcg cccccatcgt cgg 143

<210> 6
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic primer
<400> 6

tatatagaat tccagttaag cctcgagtgg cttgcagac 39

<210> 7 
<211> 1506 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 



<400> 7 



atggatgttt tgttgttgga gaaggccctc ctgggcctct tcgccgcggc ggtgctggcc 60
atcgccgtcg ccaagctcac cggcaagcgc ttccgcctcc cccctggccc ctccggcgcc 120
cccatcgtcg gcaactggct gcaggtcggc gacgacctca accaccgcaa cctgatgggc 180
ctggccaagc ggttcggcga ggtgttcctc ctccgcatgg gcgtccgcaa cctggtggtc 240
gtctccagcc ccgagctcgc caaggaggtc ctccacaccc agggcgtcga gttcggctcc 300
cgcacccgca acgtcgtctt cgacatcttc accggcaagg gacaggacat ggtgttcacg 360
gtgtacggcg accactggcg caagatgcgg cggatcatga cggtgccctt cttcaccaac 420
aaggtggtgg cgcagaaccg cgtggggtgg gaggaggagg cccggctggt ggtggaggac 480
ctcaaggccg acccggcggc ggcgacggcg ggcgtggtgg tccgccgcag gctgcagctc 540
atgatgtaca acgacatgtt ccgcatcatg ttcgaccgcc ggttcgagag cgtggccgac 600
ccgctcttca accagctcaa ggcgctcaac gccgagcgca gcatcctctc ccagagcttc 660
gactacaact acggcgactt catccccgtc ctccgcccct tcctccgccg ctacctcaac 720
cgctgcacca acctcaagac caagcggatg aaggtgttcg aggaccactt cgtccagcag 780
cgcaaggagg cgttggagaa gacgggtgag atcaggtgcg ccatggacca catcctggaa 840
gccgaaagga agggcgagat caaccacgac aacgtcctct acatcgtcga gaacatcaac 900
gtcgcagcca tcgagacgac gctgtggtcg atcgagtggg gcctcgcgga gctggtgaac 960
cacccggaga tccagcagaa gctgcgcgag gagatcgtcg ccgttctggg cgccggcgtg 1020
gcggtgacgg agccggacct ggagcgcctc ccctacctgc agtccgtggt gaaggagacg 1080
ctccgcctcc gcatggcaat cccgctcctg gtgccgcaca tgaacctcag cgacgccaag 1140
ctcgccggct acgacatccc cgccgagtcc aagatcctcg tcaacgcctg gttcctcgcc 1200
aacgacccca agcggtgggt gcgcgccgat gagttcaggc cggagaggtt cctcgaggag 1260
gagaaggccg tcgaggccca cggcaacgat ttccggttcg tgcccttcgg cgtcggccgc 1320
cggagctgcc ccgggatcat cctcgcgctg cccatcatcg gcatcacgct cggacgcctg 1380
gtgcagaact tccagctgct gccgccgccg gggcaggaca agatcgacac caccgagaag 1440
cccgggcagt ttaccaacca gatcctcaag cacgccacca ttgtctgcaa gccactcgag 1500
gcttaa 1506

<210> 8
<211> 1506
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 
<400> 8 

atggatgttt tgttgttgga aaaagctttg ttgggtttgt tcgccgcggc ggtgctggcc 60
atcgccgtcg ccaagctcac cggcaagcgc ttccgcctcc cccctggccc ctccggcgcc 120
cccatcgtcg gcaactggct gcaggtcggc gacgacctca accaccgcaa cctgatgggc 180
ctggccaagc ggttcggcga ggtgttcctc ctccgcatgg gcgtccgcaa cctggtggtc 240
gtctccagcc ccgagctcgc caaggaggtc ctccacaccc agggcgtcga gttcggctcc 300
cgcacccgca acgtcgtctt cgacatcttc accggcaagg gacaggacat ggtgttcacg 360
gtgtacggcg accactggcg caagatgcgg cggatcatga cggtgccctt cttcaccaac 420
aaggtggtgg cgcagaaccg cgtggggtgg gaggaggagg cccggctggt ggtggaggac 480
ctcaaggccg acccggcggc ggcgacggcg ggcgtggtgg tccgccgcag gctgcagctc 540
atgatgtaca acgacatgtt ccgcatcatg ttcgaccgcc ggttcgagag cgtggccgac 600
ccgctcttca accagctcaa ggcgctcaac gccgagcgca gcatcctctc ccagagcttc 660
gactacaact acggcgactt catccccgtc ctccgcccct tcctccgccg ctacctcaac 720
cgctgcacca acctcaagac caagcggatg aaggtgttcg aggaccactt cgtccagcag 780
cgcaaggagg cgttggagaa gacgggtgag atcaggtgcg ccatggacca catcctggaa 840
gccgaaagga agggcgagat caaccacgac aacgtcctct acatcgtcga gaacatcaac 900
gtcgcagcca tcgagacgac gctgtggtcg atcgagtggg gcctcgcgga gctggtgaac 960
cacccggaga tccagcagaa gctgcgcgag gagatcgtcg ccgttctggg cgccggcgtg 1020
gcggtgacgg agccggacct ggagcgcctc ccctacctgc agtccgtggt gaaggagacg 1080
ctccgcctcc gcatggcaat cccgctcctg gtgccgcaca tgaacctcag cgacgccaag 1140
ctcgccggct acgacatccc cgccgagtcc aagatcctcg tcaacgcctg gttcctcgcc 1200
aacgacccca agcggtgggt gcgcgccgat gagttcaggc cggagaggtt cctcgaggag 1260
gagaaggccg tcgaggccca cggcaacgat ttccggttcg tgcccttcgg cgtcggccgc 1320
cggagctgcc ccgggatcat cctcgcgctg cccatcatcg gcatcacgct cggacgcctg 1380
gtgcagaact tccagctgct gccgccgccg gggcaggaca agatcgacac caccgagaag 1440
cccgggcagt ttaccaacca gatcctcaag cacgccacca ttgtctgcaa gccactcgag 1500
gcttaa 1506

<210> 9
<211> 1506
<212> DNA
<213> Artificial Sequence
<220>
<223> Altered sequences
<400> 9

atggatgttt tgttgttgga aaaagctttg ttgggtttgt ttgctgctgc tgttttggct 60
attgctgttg ctaaattgac tggtaaaaga tttagattgc caccaggtcc atccggcgcc 120
cccatcgtcg gcaactggct gcaggtcggc gacgacctca accaccgcaa cctgatgggc 180
ctggccaagc ggttcggcga ggtgttcctc ctccgcatgg gcgtccgcaa cctggtggtc 240
gtctccagcc ccgagctcgc caaggaggtc ctccacaccc agggcgtcga gttcggctcc 300
cgcacccgca acgtcgtctt cgacatcttc accggcaagg gacaggacat ggtgttcacg 360
gtgtacggcg accactggcg caagatgcgg cggatcatga cggtgccctt cttcaccaac 420
aaggtggtgg cgcagaaccg cgtggggtgg gaggaggagg cccggctggt ggtggaggac 480
ctcaaggccg acccggcggc ggcgacggcg ggcgtggtgg tccgccgcag gctgcagctc 540
atgatgtaca acgacatgtt ccgcatcatg ttcgaccgcc ggttcgagag cgtggccgac 600
ccgctcttca accagctcaa ggcgctcaac gccgagcgca gcatcctctc ccagagcttc 660
gactacaact acggcgactt catccccgtc ctccgcccct tcctccgccg ctacctcaac 720
cgctgcacca acctcaagac caagcggatg aaggtgttcg aggaccactt cgtccagcag 780
cgcaaggagg cgttggagaa gacgggtgag atcaggtgcg ccatggacca catcctggaa 840
gccgaaagga agggcgagat caaccacgac aacgtcctct acatcgtcga gaacatcaac 900
gtcgcagcca tcgagacgac gctgtggtcg atcgagtggg gcctcgcgga gctggtgaac 960
cacccggaga tccagcagaa gctgcgcgag gagatcgtcg ccgttctggg cgccggcgtg 1020
gcggtgacgg agccggacct ggagcgcctc ccctacctgc agtccgtggt gaaggagacg 1080
ctccgcctcc gcatggcaat cccgctcctg gtgccgcaca tgaacctcag cgacgccaag 1140
ctcgccggct acgacatccc cgccgagtcc aagatcctcg tcaacgcctg gttcctcgcc 1200
aacgacccca agcggtgggt gcgcgccgat gagttcaggc cggagaggtt cctcgaggag 1260
gagaaggccg tcgaggccca cggcaacgat ttccggttcg tgcccttcgg cgtcggccgc 1320
cggagctgcc ccgggatcat cctcgcgctg cccatcatcg gcatcacgct cggacgcctg 1380
gtgcagaact tccagctgct gccgccgccg gggcaggaca agatcgacac caccgagaag 1440
cccgggcagt ttaccaacca gatcctcaag cacgccacca ttgtctgcaa gccactcgag 1500
gcttaa 1506



<210> 10 
<211> 2181 
<212> DNA 

<213> Triticum aestivum 



<400> 10 

cgatccaccc cttggatcca ctctacccag ctcgctagcc agcggggtac atacacgcac 60
ffcacgtacgc gcgtacgtac actcgcagag cttgcttcag ggaggccggc aatggaggtg 120
gggacgtggg cggtggtggt gtcggcggtg gccgcgtaca tggcgtggtt ctggcggatg 180
tcccgcgggc tgcgcgggcc gcgggtttgg cccgtgctcg gcagcctgcc gggcctggtg 240
cagcacgccg aggacatgca cgagtggatc gccggcaacc tgcgccgcgc gggcggcacg 300
taccagacct gcatcttcgc cgtgcccggg gtggcgcgcc gcggcggcct ggtcaccgtc 360
acctgcgacc cgcgcaacct ggagcacgtc ctgaaggcgc gcttcgacaa ctaccccaag 420
ggccccttct ggcacggcgt cttccgggac ctgctcggcg acggcatctt caattccgac 480
ggcgacacct ggctcgcgca gcgcaagacg gccgcgctcg agttcaccac ccgcacgctc 540
cggacggcca tgtcccgctg ggtctcgcgc tccatccacg gccgcctcct gcccatcctg 600
gccgacgcgg ccaagggcaa ggcgcaggtg gatctccagg acctcctcct ccgcctcacc 660
ttcgacaaca tctgcggcct ggccttcggc aaggacccgg agacgctcgc ccagggcctg 720
ccggagaacg agttcgcctc cgcgttcgac cgcgccaccg aggccacgct caaccgcttc 780
atcttcccgg agttcctgtg gcgctgcaaa aagtggctgg gcctcggcat ggagaccacg 840
ctgaccagca gcatggccca cgtcgaccag tacctcgccg ccgtcatcaa gaagcgcaag 900
ctcgagctcg ccgccggcaa cggcaaatgc gacacggcgg cgacgcacga cgacctgctc 960
tcccggttca tgcggaaggg ttcctactcg gacgagtcgc tccagcacgt ggcgctcaac 1020
ttcatcctcg ccggccgcga cacctcctcc gtggcgctct cctggttctt ctggctcgtg 1080
tccacccacc ctgcggtgga gcgcaagatc gtgcgcgagc tctgctccgt tctcgccgcg 1140
tcacggggcg cccatgaccc ggcattgtgg ctggcggagc ccttcacctt cgaggagctc 1200
gaccgcctgg tctacctcaa ggcggcgctg tcggagaccc tccgcctcta cccctccgtc 1260
cccgaggact ccaagcacgt cgtcgcggac gactacctcc ccgacggcac cttcgtgccg 1320
gccgggtcgt cggtcaccta ctccatatac tcggcggggc gcatgaaggg ggtgtggggg 1380
gaggactgcc tcgagttccg gccggagcga tggctgtcgg ccgacggcac caagttcgag 1440
cagcacgact cgtacaagtt cgtggcgttc aacgccgggc cgagggtgtg cctgggcaag 1500
gacctagcct acctgcagat gaagaacatc gccgggagcg tgctgctccg gcaccgcctg 1560
accgtggcgc cgggccaccg cgtggagcag aagatgtcgc tcacgctctt catgaagggc 1620
gggctacgga tggaggtacg tccgcgcgac ctcgcccccg tcctcgacga gccctgcggc 1680
ctggacgccg gcgccgccac cgccgccgca gcaagtgcca cagcgccgtg cgcgtagaag 1740
acctggcacc ggcacgcgcc atgcatgatt cgtgcgtgct agctgttgaa gggacgccgg 1800
acattgaatg tgtagatagg gcagcagtgc aagaccgtaa gtaaaattga tgatgggttt 1860
ggtgacaaca ttgaagccac tcctttccag aatttacgac ccggatagga gaaacaggga 1920
aactttgcag atcacaacac aagatctagc cagccgggga tctgatctga tttgcgtctg 1980
ctcggagcac gggtgcatgg gagaccaagg aggaaaacaa aaaataacag aaacagagtg 2040
agcaatattt gtgattgtag ccacgggaaa gagagaggag taattagtaa ttcagatttg 2100
tttgcagtag ctcggtgttg gtgaccagat catagccaac taggctattc tattctattc 2160
tatttttgaa gatgattttt c 2181

<210> 11 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic primer 

<400> 11

atatatggat ccatggaggt ggggacgtgg gcggtggtg 39

<210> 12
<211> 150
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic primer
<400> 12

atatatggat ccatggaagt tggtacttgg gctgttgttg tttctgctgt tgctgcttat 60
atggcttggt tttggagaat gtctagaggt ttgagaggtc caagagtttg gccagttttg 120
ggttctttgc caggcctggt gcagcacgcc 150

<210> 13 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic primer 
<400> 13 

tatatagaat tccttctacg cgcacggcgc tgtggcactt gc 42 

<210> 14 
<211> 1626 



<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 
<400> 14 

atggaagttg gtacttgggc tgttgttgtt tctgctgttg ctgcttatat ggcttggttt 60
tggagaatgt ctagaggttt gagaggtcca agagtttggc cagttttggg ttctttgcca 120
ggcctggtgc agcacgccga ggacatgcac gagtggatcg ccggcaacct gcgccgcgcg 180
ggcggcacgt accagacctg catcttcgcc gtgcccgggg tggcgcgccg cggcggcctg 240
gtcaccgtca cctgcgaccc gcgcaacctg gagcacgtcc tgaaggcgcg cttcgacaac 300
taccccaagg gccccttctg gcacggcgtc ttccgggacc tgctcggcga cggcatcttc 360
aattccgacg gcgacacctg gctcgcgcag cgcaagacgg ccgcgctcga gttcaccacc 420
cgcacgctcc ggacggccat gtcccgctgg gtctcgcgct ccatccacgg ccgcctcctg 480
cccatcctgg ccgacgcggc caagggcaag gcgcaggtgg atctccagga cctcctcctc 540
cgcctcacct tcgacaacat ctgcggcctg gccttcggca aggacccgga gacgctcgcc 600
cagggcctgc cggagaacga gttcgcctcc gcgttcgacc gcgccaccga ggccacgctc 660
aaccgcttca tcttcccgga gttcctgtgg cgctgcaaaa agtggctggg cctcggcatg 720
gagaccacgc tgaccagcag catggcccac gtcgaccagt acctcgccgc cgtcatcaag 780
aagcgcaagc tcgagctcgc cgccggcaac ggcaaatgcg acacggcggc gacgcacgac 840
gacctgctct cccggttcat gcggaagggt tcctactcgg acgagtcgct ccagcacgtg 900
gcgctcaact tcatcctcgc cggccgcgac acctcctccg tggcgctctc ctggttcttc 960
tggctcgtgt ccacccaccc tgcggtggag cgcaagatcg tgcgcgagct ctgctccgtt 1020
ctcgccgcgt cacggggcgc ccatgacccg gcattgtggc tggcggagcc cttcaccttc 1080
gaggagctcg accgcctggt ctacctcaag gcggcgctgt cggagaccct ccgcctctac 1140
ccctccgtcc ccgaggactc caagcacgtc gtcgcggacg actacctccc cgacggcacc 1200
ttcgtgccgg ccgggtcgtc ggtcacctac tccatatact cggcggggcg catgaagggg 1260
gtgtgggggg aggactgcct cgagttccgg ccggagcgat ggctgtcggc cgacggcacc 1320
aagttcgagc agcacgactc gtacaagttc gtggcgttca acgccgggcc gagggtgtgc 1380
ctgggcaagg acctagccta cctgcagatg aagaacatcg ccgggagcgt gctgctccgg 1440
caccgcctga ccgtggcgcc gggccaccgc gtggagcaga agatgtcgct cacgctcttc 1500
atgaagggcg ggctacggat ggaggtacgt ccgcgcgacc tcgcccccgt cctcgacgag 1560
ccctgcggcc tggacgccgg cgccgccacc gccgccgcag caagtgccac agcgccgtgc 1620
gcgtag 1626

<210> 15 
<211> 501 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 



<400> 15 






Met Asp Val Leu Leu Leu Glu Lys Ala Leu Leu Gly Leu Phe Ala Ala
1 5 10 15

Ala Val Leu Ala Ile Ala Val Ala Lys Leu Thr Gly Lys Arg Phe Arg
20 25 30

Leu Pro Pro Gly Pro Ser Gly Ala Pro Ile Val Gly Asn Trp Leu Gln
35 40 45

Val Gly Asp Asp Leu Asn His Arg Asn Leu Met Gly Leu Ala Lys Arg
50 55 60

Phe Gly Glu Val Phe Leu Leu Arg Met Gly Val Arg Asn Leu Val Val
65 70 75 80

Val Ser Ser Pro Glu Leu Ala Lys Glu Val Leu His Thr Gln Gly Val
85 90 95

Glu Phe Gly Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly
100 105 110

Lys Gly Gln Asp Met Val Phe Thr Val Tyr Gly Asp His Trp Arg Lys
115 120 125

Met Arg Arg Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Ala
130 135 140

Gln Asn Arg Val Gly Trp Glu Glu Glu Ala Arg Leu Val Val Glu Asp
145 150 155 160

Leu Lys Ala Asp Pro Ala Ala Ala Thr Ala Gly Val Val Val Arg Arg
165 170 175

Arg Leu Gln Leu Met Met Tyr Asn Asp Met Phe Arg Ile Met Phe Asp
180 185 190

Arg Arg Phe Glu Ser Val Ala Asp Pro Leu Phe Asn Gln Leu Lys Ala
195 200 205

Leu Asn Ala Glu Arg Ser Ile Leu Ser Gln Ser Phe Asp Tyr Asn Tyr
210 215 220

Gly Asp Phe Ile Pro Val Leu Arg Pro Phe Leu Arg Arg Tyr Leu Asn
225 230 235 240

Arg Cys Thr Asn Leu Lys Thr Lys Arg Met Lys Val Phe Glu Asp His
245 250 255

Phe Val Gln Gln Arg Lys Glu Ala Leu Glu Lys Thr Gly Glu Ile Arg
260 265 270

Cys Ala Met Asp His Ile Leu Glu Ala Glu Arg Lys Gly Glu Ile Asn
275 280 285

His Asp Asn Val Leu Tyr Ile Val Glu Asn Ile Asn Val Ala Ala Ile
290 295 300

Glu Thr Thr Leu Trp Ser Ile Glu Trp Gly Leu Ala Glu Leu Val Asn
305 310 315 320

His Pro Glu Ile Gln Gln Lys Leu Arg Glu Glu Ile Val Ala Val Leu
325 330 335

Gly Ala Gly Val Ala Val Thr Glu Pro Asp Leu Glu Arg Leu Pro Tyr
340 345 350

Leu Gln Ser Val Val Lys Glu Thr Leu Arg Leu Arg Met Ala Ile Pro
355 360 365

Leu Leu Val Pro His Met Asn Leu Ser Asp Ala Lys Leu Ala Gly Tyr
370 375 380

Asp Ile Pro Ala Glu Ser Lys Ile Leu Val Asn Ala Trp Phe Leu Ala
385 390 395 400

Asn Asp Pro Lys Arg Trp Val Arg Ala Asp Glu Phe Arg Pro Glu Arg
405 410 415

Phe Leu Glu Glu Glu Lys Ala Val Glu Ala His Gly Asn Asp Phe Arg
420 425 430

Phe Val Pro Phe Gly Val Gly Arg Arg Ser Cys Pro Gly Ile Ile Leu
435 440 445

Ala Leu Pro Ile Ile Gly Ile Thr Leu Gly Arg Leu Val Gln Asn Phe
450 455 460

Gln Leu Leu Pro Pro Pro Gly Gln Asp Lys Ile Asp Thr Thr Glu Lys
465 470 475 480

Pro Gly Gln Phe Thr Asn Gln Ile Leu Lys His Ala Thr Ile Val Cys
485 490 495

Lys Pro Leu Glu Ala
500

<210> 16 
<211> 501

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 
<400> 16 

Met Asp Val Leu Leu Leu Glu Lys Ala Leu Leu Gly Leu Phe Ala Ala
1 5 10 15

Ala Val Leu Ala Ile Ala Val Ala Lys Leu Thr Gly Lys Arg Phe Arg
20 25 30

Leu Pro Pro Gly Pro Ser Gly Ala Pro Ile Val Gly Asn Trp Leu Gln
35 40 45

Val Gly Asp Asp Leu Asn His Arg Asn Leu Met Gly Leu Ala Lys Arg
50 55 60

Phe Gly Glu Val Phe Leu Leu Arg Met Gly Val Arg Asn Leu Val Val
65 70 75 80

Val Ser Ser Pro Glu Leu Ala Lys Glu Val Leu His Thr Gln Gly Val
85 90 95

Glu Phe Gly Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly
100 105 110

Lys Gly Gln Asp Met Val Phe Thr Val Tyr Gly Asp His Trp Arg Lys
115 120 125

Met Arg Arg Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Ala
130 135 140

Gln Asn Arg Val Gly Trp Glu Glu Glu Ala Arg Leu Val Val Glu Asp
145 150 155 160

Leu Lys Ala Asp Pro Ala Ala Ala Thr Ala Gly Val Val Val Arg Arg
165 170 175

Arg Leu Gln Leu Met Met Tyr Asn Asp Met Phe Arg Ile Met Phe Asp
180 185 190

Arg Arg Phe Glu Ser Val Ala Asp Pro Leu Phe Asn Gln Leu Lys Ala
195 200 205

Leu Asn Ala Glu Arg Ser Ile Leu Ser Gln Ser Phe Asp Tyr Asn Tyr
210 215 220

Gly Asp Phe Ile Pro Val Leu Arg Pro Phe Leu Arg Arg Tyr Leu Asn
225 230 235 240

Arg Cys Thr Asn Leu Lys Thr Lys Arg Met Lys Val Phe Glu Asp His
245 250 255

Phe Val Gln Gln Arg Lys Glu Ala Leu Glu Lys Thr Gly Glu Ile Arg
260 265 270

Cys Ala Met Asp His Ile Leu Glu Ala Glu Arg Lys Gly Glu Ile Asn
275 280 285

His Asp Asn Val Leu Tyr Ile Val Glu Asn Ile Asn Val Ala Ala Ile
290 295 300

Glu Thr Thr Leu Trp Ser Ile Glu Trp Gly Leu Ala Glu Leu Val Asn
305 310 315 320

His Pro Glu Ile Gln Gln Lys Leu Arg Glu Glu Ile Val Ala Val Leu
325 330 335

Gly Ala Gly Val Ala Val Thr Glu Pro Asp Leu Glu Arg Leu Pro Tyr
340 345 350

Leu Gln Ser Val Val Lys Glu Thr Leu Arg Leu Arg Met Ala Ile Pro
355 360 365

Leu Leu Val Pro His Met Asn Leu Ser Asp Ala Lys Leu Ala Gly Tyr
370 375 380

Asp Ile Pro Ala Glu Ser Lys Ile Leu Val Asn Ala Trp Phe Leu Ala
385 390 395 400

Asn Asp Pro Lys Arg Trp Val Arg Ala Asp Glu Phe Arg Pro Glu Arg
405 410 415

Phe Leu Glu Glu Glu Lys Ala Val Glu Ala His Gly Asn Asp Phe Arg
420 425 430

Phe Val Pro Phe Gly Val Gly Arg Arg Ser Cys Pro Gly Ile Ile Leu
435 440 445

Ala Leu Pro Ile Ile Gly Ile Thr Leu Gly Arg Leu Val Gln Asn Phe
450 455 460

Gln Leu Leu Pro Pro Pro Gly Gln Asp Lys Ile Asp Thr Thr Glu Lys
465 470 475 480

Pro Gly Gln Phe Thr Asn Gln Ile Leu Lys His Ala Thr Ile Val Cys
485 490 495

Lys Pro Leu Glu Ala
500

<210> 17 
<211> 501 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 
<400> 17 

Met Asp Val Leu Leu Leu Glu Lys Ala Leu Leu Gly Leu Phe Ala Ala
1 5 10 15

Ala Val Leu Ala Ile Ala Val Ala Lys Leu Thr Gly Lys Arg Phe Arg
20 25 30

Leu Pro Pro Gly Pro Ser Gly Ala Pro Ile Val Gly Asn Trp Leu Gln
35 40 45

Val Gly Asp Asp Leu Asn His Arg Asn Leu Met Gly Leu Ala Lys Arg
50 55 60

Phe Gly Glu Val Phe Leu Leu Arg Met Gly Val Arg Asn Leu Val Val
65 70 75 80

Val Ser Ser Pro Glu Leu Ala Lys Glu Val Leu His Thr Gln Gly Val
85 90 95

Glu Phe Gly Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly
100 105 110

Lys Gly Gln Asp Met Val Phe Thr Val Tyr Gly Asp His Trp Arg Lys
115 120 125

Met Arg Arg Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Ala
130 135 140

Gln Asn Arg Val Gly Trp Glu Glu Glu Ala Arg Leu Val Val Glu Asp
145 150 155 160

Leu Lys Ala Asp Pro Ala Ala Ala Thr Ala Gly Val Val Val Arg Arg
165 170 175

Arg Leu Gln Leu Met Met Tyr Asn Asp Met Phe Arg Ile Met Phe Asp
180 185 190

Arg Arg Phe Glu Ser Val Ala Asp Pro Leu Phe Asn Gln Leu Lys Ala
195 200 205

Leu Asn Ala Glu Arg Ser Ile Leu Ser Gln Ser Phe Asp Tyr Asn Tyr
210 215 220

Gly Asp Phe Ile Pro Val Leu Arg Pro Phe Leu Arg Arg Tyr Leu Asn
225 230 235 240

Arg Cys Thr Asn Leu Lys Thr Lys Arg Met Lys Val Phe Glu Asp His
245 250 255

Phe Val Gln Gln Arg Lys Glu Ala Leu Glu Lys Thr Gly Glu Ile Arg
260 265 270

Cys Ala Met Asp His Ile Leu Glu Ala Glu Arg Lys Gly Glu Ile Asn
275 280 285

His Asp Asn Val Leu Tyr Ile Val Glu Asn Ile Asn Val Ala Ala Ile
290 295 300

Glu Thr Thr Leu Trp Ser Ile Glu Trp Gly Leu Ala Glu Leu Val Asn
305 310 315 320

His Pro Glu Ile Gln Gln Lys Leu Arg Glu Glu Ile Val Ala Val Leu
325 330 335

Gly Ala Gly Val Ala Val Thr Glu Pro Asp Leu Glu Arg Leu Pro Tyr
340 345 350

Leu Gln Ser Val Val Lys Glu Thr Leu Arg Leu Arg Met Ala Ile Pro

355 360 365 

Leu Leu Val Pro His Met Asn Leu Ser Asp Ala Lys Leu Ala Gly Tyr 

370 375 380 

Asp Ile Pro Ala Glu Ser Lys Ile Leu Val Asn Ala Trp Phe Leu Ala
385 390 395 400

Asn Asp Pro Lys Arg Trp Val Arg Ala Asp Glu Phe Arg Pro Glu Arg
405 410 415

Phe Leu Glu Glu Glu Lys Ala Val Glu Ala His Gly Asn Asp Phe Arg
420 425 430

Phe Val Pro Phe Gly Val Gly Arg Arg Ser Cys Pro Gly Ile Ile Leu
435 440 445

Ala Leu Pro Ile Ile Gly Ile Thr Leu Gly Arg Leu Val Gln Asn Phe
450 455 460

Gln Leu Leu Pro Pro Pro Gly Gln Asp Lys Ile Asp Thr Thr Glu Lys
465 470 475 480

Pro Gly Gln Phe Thr Asn Gln Ile Leu Lys His Ala Thr Ile Val Cys

485 490 495 

Lys Pro Leu Glu Ala 
500 

<210> 18 
<211> 501 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 
<400> 18 

Met Asp Val Leu Leu Leu Glu Lys Ala Leu Leu Gly Leu Phe Ala Ala 
1 5 10 15 






Ala Val Leu Ala Ile Ala Val Ala Lys Leu Thr Gly Lys Arg Phe Arg
20 25 30

Leu Pro Pro Gly Pro Ser Gly Ala Pro Ile Val Gly Asn Trp Leu Gln
35 40 45

Val Gly Asp Asp Leu Asn His Arg Asn Leu Met Gly Leu Ala Lys Arg
50 55 60

Phe Gly Glu Val Phe Leu Leu Arg Met Gly Val Arg Asn Leu Val Val
65 70 75 80

Val Ser Ser Pro Glu Leu Ala Lys Glu Val Leu His Thr Gln Gly Val
85 90 95

Glu Phe Gly Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly
100 105 110

Lys Gly Gln Asp Met Val Phe Thr Val Tyr Gly Asp His Trp Arg Lys
115 120 125

Met Arg Arg Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Ala

130 135 140 

Gln Asn Arg Val Gly Trp Glu Glu Glu Ala Arg Leu Val Val Glu Asp
145 150 155 160

Leu Lys Ala Asp Pro Ala Ala Ala Thr Ala Gly Val Val Val Arg Arg
165 170 175

Arg Leu Gln Leu Met Met Tyr Asn Asp Met Phe Arg Ile Met Phe Asp
180 185 190

Arg Arg Phe Glu Ser Val Ala Asp Pro Leu Phe Asn Gln Leu Lys Ala
195 200 205

Leu Asn Ala Glu Arg Ser Ile Leu Ser Gln Ser Phe Asp Tyr Asn Tyr
210 215 220

Gly Asp Phe Ile Pro Val Leu Arg Pro Phe Leu Arg Arg Tyr Leu Asn
225 230 235 240

Arg Cys Thr Asn Leu Lys Thr Lys Arg Met Lys Val Phe Glu Asp His

245 250 255 

Phe Val Gln Gln Arg Lys Glu Ala Leu Glu Lys Thr Gly Glu Ile Arg
260 265 270

Cys Ala Met Asp His Ile Leu Glu Ala Glu Arg Lys Gly Glu Ile Asn
275 280 285

His Asp Asn Val Leu Tyr Ile Val Glu Asn Ile Asn Val Ala Ala Ile
290 295 300

Glu Thr Thr Leu Trp Ser Ile Glu Trp Gly Leu Ala Glu Leu Val Asn
305 310 315 320

His Pro Glu Ile Gln Gln Lys Leu Arg Glu Glu Ile Val Ala Val Leu
325 330 335

Gly Ala Gly Val Ala Val Thr Glu Pro Asp Leu Glu Arg Leu Pro Tyr
340 345 350

Leu Gln Ser Val Val Lys Glu Thr Leu Arg Leu Arg Met Ala Ile Pro

355 360 365 

Leu Leu Val Pro His Met Asn Leu Ser Asp Ala Lys Leu Ala Gly Tyr
370 375 380

Asp Ile Pro Ala Glu Ser Lys Ile Leu Val Asn Ala Trp Phe Leu Ala
385 390 395 400

Asn Asp Pro Lys Arg Trp Val Arg Ala Asp Glu Phe Arg Pro Glu Arg
405 410 415

Phe Leu Glu Glu Glu Lys Ala Val Glu Ala His Gly Asn Asp Phe Arg
420 425 430

Phe Val Pro Phe Gly Val Gly Arg Arg Ser Cys Pro Gly Ile Ile Leu
435 440 445

Ala Leu Pro Ile Ile Gly Ile Thr Leu Gly Arg Leu Val Gln Asn Phe
450 455 460

Gln Leu Leu Pro Pro Pro Gly Gln Asp Lys Ile Asp Thr Thr Glu Lys
465 470 475 480

Pro Gly Gln Phe Thr Asn Gln Ile Leu Lys His Ala Thr Ile Val Cys
485 490 495

Lys Pro Leu Glu Ala
500



<210> 19 
<211> 541 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 



<400> 19 
Met Glu Val Gly Thr Trp Ala Val Val Val Ser Ala Val Ala Ala Tyr
1 5 10 15

Met Ala Trp Phe Trp Arg Met Ser Arg Gly Leu Arg Gly Pro Arg Val
20 25 30

Trp Pro Val Leu Gly Ser Leu Pro Gly Leu Val Gln His Ala Glu Asp
35 40 45

Met His Glu Trp Ile Ala Gly Asn Leu Arg Arg Ala Gly Gly Thr Tyr
50 55 60

Gln Thr Cys Ile Phe Ala Val Pro Gly Val Ala Arg Arg Gly Gly Leu
65 70 75 80

Val Thr Val Thr Cys Asp Pro Arg Asn Leu Glu His Val Leu Lys Ala
85 90 95

Arg Phe Asp Asn Tyr Pro Lys Gly Pro Phe Trp His Gly Val Phe Arg
100 105 110

Asp Leu Leu Gly Asp Gly Ile Phe Asn Ser Asp Gly Asp Thr Trp Leu
115 120 125

Ala Gln Arg Lys Thr Ala Ala Leu Glu Phe Thr Thr Arg Thr Leu Arg
130 135 140



Thr Ala Met Ser Arg Trp Val Ser Arg Ser Ile His Gly Arg Leu Leu
145 150 155 160

Pro Ile Leu Ala Asp Ala Ala Lys Gly Lys Ala Gln Val Asp Leu Gln
165 170 175

Asp Leu Leu Leu Arg Leu Thr Phe Asp Asn Ile Cys Gly Leu Ala Phe
180 185 190

Gly Lys Asp Pro Glu Thr Leu Ala Gln Gly Leu Pro Glu Asn Glu Phe
195 200 205

Ala Ser Ala Phe Asp Arg Ala Thr Glu Ala Thr Leu Asn Arg Phe Ile
210 215 220

Phe Pro Glu Phe Leu Trp Arg Cys Lys Lys Trp Leu Gly Leu Gly Met
225 230 235 240

Glu Thr Thr Leu Thr Ser Ser Met Ala His Val Asp Gln Tyr Leu Ala

245 250 255 

Ala Val Ile Lys Lys Arg Lys Leu Glu Leu Ala Ala Gly Asn Gly Lys
260 265 270

Cys Asp Thr Ala Ala Thr His Asp Asp Leu Leu Ser Arg Phe Met Arg
275 280 285

Lys Gly Ser Tyr Ser Asp Glu Ser Leu Gln His Val Ala Leu Asn Phe
290 295 300

Ile Leu Ala Gly Arg Asp Thr Ser Ser Val Ala Leu Ser Trp Phe Phe
305 310 315 320

Trp Leu Val Ser Thr His Pro Ala Val Glu Arg Lys Ile Val Arg Glu
325 330 335

Leu Cys Ser Val Leu Ala Ala Ser Arg Gly Ala His Asp Pro Ala Leu
340 345 350

Trp Leu Ala Glu Pro Phe Thr Phe Glu Glu Leu Asp Arg Leu Val Tyr
355 360 365

Leu Lys Ala Ala Leu Ser Glu Thr Leu Arg Leu Tyr Pro Ser Val Pro

370 375 380 

Glu Asp Ser Lys His Val Val Ala Asp Asp Tyr Leu Pro Asp Gly Thr 
385 390 395 400 

Phe Val Pro Ala Gly Ser Ser Val Thr Tyr Ser Ile Tyr Ser Ala Gly

405 410 415 

Arg Met Lys Gly Val Trp Gly Glu Asp Cys Leu Glu Phe Arg Pro Glu 

420 425 430 

Arg Trp Leu Ser Ala Asp Gly Thr Lys Phe Glu Gln His Asp Ser Tyr

435 440 445 

Lys Phe Val Ala Phe Asn Ala Gly Pro Arg Val Cys Leu Gly Lys Asp 

450 455 460 

Leu Ala Tyr Leu Gln Met Lys Asn Ile Ala Gly Ser Val Leu Leu Arg
465 470 475 480

His Arg Leu Thr Val Ala Pro Gly His Arg Val Glu Gln Lys Met Ser

485 490 495 

Leu Thr Leu Phe Met Lys Gly Gly Leu Arg Met Glu Val Arg Pro Arg 






500 505 510 

Asp Leu Ala Pro Val Leu Asp Glu Pro Cys Gly Leu Asp Ala Gly Ala

515 520 525 

Ala Thr Ala Ala Ala Ala Ser Ala Thr Ala Pro Cys Ala 
530 535 540 

<210> 20 
<211> 541 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Altered sequences 



<400> 20

Met Glu Val Gly Thr Trp Ala Val Val Val Ser Ala Val Ala Ala Tyr
1 5 10 15

Met Ala Trp Phe Trp Arg Met Ser Arg Gly Leu Arg Gly Pro Arg Val
20 25 30

Trp Pro Val Leu Gly Ser Leu Pro Gly Leu Val Gln His Ala Glu Asp
35 40 45

Met His Glu Trp Ile Ala Gly Asn Leu Arg Arg Ala Gly Gly Thr Tyr
50 55 60

Gln Thr Cys Ile Phe Ala Val Pro Gly Val Ala Arg Arg Gly Gly Leu
65 70 75 80

Val Thr Val Thr Cys Asp Pro Arg Asn Leu Glu His Val Leu Lys Ala
85 90 95

Arg Phe Asp Asn Tyr Pro Lys Gly Pro Phe Trp His Gly Val Phe Arg
100 105 110

Asp Leu Leu Gly Asp Gly Ile Phe Asn Ser Asp Gly Asp Thr Trp Leu
115 120 125

Ala Gln Arg Lys Thr Ala Ala Leu Glu Phe Thr Thr Arg Thr Leu Arg
130 135 140

Thr Ala Met Ser Arg Trp Val Ser Arg Ser Ile His Gly Arg Leu Leu
145 150 155 160

Pro Ile Leu Ala Asp Ala Ala Lys Gly Lys Ala Gln Val Asp Leu Gln
165 170 175

Asp Leu Leu Leu Arg Leu Thr Phe Asp Asn Ile Cys Gly Leu Ala Phe
180 185 190

Gly Lys Asp Pro Glu Thr Leu Ala Gln Gly Leu Pro Glu Asn Glu Phe
195 200 205

Ala Ser Ala Phe Asp Arg Ala Thr Glu Ala Thr Leu Asn Arg Phe Ile
210 215 220

Phe Pro Glu Phe Leu Trp Arg Cys Lys Lys Trp Leu Gly Leu Gly Met
225 230 235 240



Glu Thr Thr Leu Thr Ser Ser Met Ala His Val Asp Gln Tyr Leu Ala
245 250 255

Ala Val Ile Lys Lys Arg Lys Leu Glu Leu Ala Ala Gly Asn Gly Lys
260 265 270

Cys Asp Thr Ala Ala Thr His Asp Asp Leu Leu Ser Arg Phe Met Arg
275 280 285

Lys Gly Ser Tyr Ser Asp Glu Ser Leu Gln His Val Ala Leu Asn Phe
290 295 300

Ile Leu Ala Gly Arg Asp Thr Ser Ser Val Ala Leu Ser Trp Phe Phe
305 310 315 320

Trp Leu Val Ser Thr His Pro Ala Val Glu Arg Lys Ile Val Arg Glu
325 330 335

Leu Cys Ser Val Leu Ala Ala Ser Arg Gly Ala His Asp Pro Ala Leu
340 345 350

Trp Leu Ala Glu Pro Phe Thr Phe Glu Glu Leu Asp Arg Leu Val Tyr
355 360 365

Leu Lys Ala Ala Leu Ser Glu Thr Leu Arg Leu Tyr Pro Ser Val Pro
370 375 380

Glu Asp Ser Lys His Val Val Ala Asp Asp Tyr Leu Pro Asp Gly Thr
385 390 395 400

Phe Val Pro Ala Gly Ser Ser Val Thr Tyr Ser Ile Tyr Ser Ala Gly
405 410 415

Arg Met Lys Gly Val Trp Gly Glu Asp Cys Leu Glu Phe Arg Pro Glu
420 425 430

Arg Trp Leu Ser Ala Asp Gly Thr Lys Phe Glu Gln His Asp Ser Tyr
435 440 445

Lys Phe Val Ala Phe Asn Ala Gly Pro Arg Val Cys Leu Gly Lys Asp
450 455 460

Leu Ala Tyr Leu Gln Met Lys Asn Ile Ala Gly Ser Val Leu Leu Arg
465 470 475 480

His Arg Leu Thr Val Ala Pro Gly His Arg Val Glu Gln Lys Met Ser
485 490 495

Leu Thr Leu Phe Met Lys Gly Gly Leu Arg Met Glu Val Arg Pro Arg
500 505 510

Asp Leu Ala Pro Val Leu Asp Glu Pro Cys Gly Leu Asp Ala Gly Ala
515 520 525

Ala Thr Ala Ala Ala Ala Ser Ala Thr Ala Pro Cys Ala
530 535 540



